**Copyright 2021 Antoine SIMOULIN.**

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# Using GPT-fr 🇫🇷

<img src="https://raw.githubusercontent.com/AntoineSimoulin/gpt-fr/main/imgs/logo.png" alt="GPT-fr logo" width="200">

**GPT-fr** is a GPT model for French developed by [Quantmetry](https://www.quantmetry.com/) and the [Laboratoire de Linguistique Formelle (LLF)](http://www.llf.cnrs.fr/en).

If you're opening this notebook on Colab, you will probably need to install 🤗 Transformers and 🤗 Tokenizers. You may also want to switch the hardware accelerator to **GPU**, since all computations will be much faster.

In [1]:
%%capture
!pip install git+https://github.com/huggingface/transformers.git
!pip install tokenizers

## Requirements

In [1]:
import torch
import transformers
from transformers import GPT2Tokenizer, GPT2LMHeadModel

In [2]:
# Check that a GPU is available and print the library versions.
print('Pytorch version ...............{}'.format(torch.__version__))
print('Transformers version ..........{}'.format(transformers.__version__))
print('GPU available .................{}'.format('\u2705' if torch.cuda.device_count() > 0 else '\u274c'))
print('Available devices .............{}'.format(torch.cuda.device_count()))
# torch.cuda.current_device() raises an error when no GPU is present, so guard these calls.
if torch.cuda.is_available():
  print('Active CUDA Device: ...........{}'.format(torch.cuda.current_device()))
  print('Current cuda device: ..........{}'.format(torch.cuda.current_device()))

Pytorch version ...............1.9.0+cu102
Transformers version ..........4.9.0.dev0
GPU available .................✅
Available devices .............1
Active CUDA Device: ...........0
Current cuda device: ..........0


## Loading the model

In [30]:
# Query GPU memory used before loading the model.
if torch.cuda.is_available():
  device = torch.device('cuda:0')
else:
  device = torch.device('cpu')
memory_used_s = !nvidia-smi --query-gpu=memory.used --format=csv | grep ' MiB'
memory_used_s = int(memory_used_s[0][:-4])
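The cell above relies on the notebook `!` shell syntax and on slicing off the trailing ` MiB`. As a rough sketch of the same idea in plain Python, the helper below parses the CSV lines and only calls `nvidia-smi` when it is found on the PATH. The function names `parse_memory_used_mib` and `query_memory_used_mib` and the sample values are hypothetical and not part of the original notebook; the `nvidia-smi` flags are the ones used above.

```python
import shutil
import subprocess


def parse_memory_used_mib(lines):
    """Return the first 'NNN MiB' value found in nvidia-smi CSV lines, as an int."""
    for line in lines:
        line = line.strip()
        # The header line ends with '[MiB]', so it does not match ' MiB'.
        if line.endswith(" MiB"):
            return int(line[: -len(" MiB")])
    raise ValueError("no line ending in ' MiB' was found")


def query_memory_used_mib():
    """Run nvidia-smi if it is on the PATH; return None when it is not."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv"],
        capture_output=True, text=True, check=True,
    )
    return parse_memory_used_mib(result.stdout.splitlines())


# Example on a hand-written sample of the CSV output (hypothetical values).
sample = ["memory.used [MiB]", "1234 MiB"]
print(parse_memory_used_mib(sample))
```

Returning `None` on machines without `nvidia-smi` lets the rest of a notebook run on CPU instead of failing on the `int(...)` conversion.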

In [31]:
# Load pretrained model and tokenizer.
# The model will be downloaded from HuggingFace hub and cached.
# It may take ~5 minutes for the first execution.

model = GPT2LMHeadModel.from_pretrained("asi/gpt-fr-cased-base").to(device)
tokenizer = GPT2Tokenizer.from_pretrained("asi/gpt-fr-cased-base")
tokenizer.add_special_tokens({
  "eos_token": "</s>",
  "bos_token": "<s>",
  "unk_token": "<unk>",
  "pad_token": "<pad>",
  "mask_token": "<mask>"
})

0

In [15]:
# Query GPU memory used after loading the model.
memory_used_e = !nvidia-smi --query-gpu=memory.used --format=csv | grep ' MiB'
memory_used_e = int(memory_used_e[0][:-4])
print("Model loaded in GPU memory and uses {:.2f} GiB of GPU RAM.".format(float(memory_used_e - memory_used_s)/1024))

Model loaded in GPU memory and uses **4.82** GiB of GPU RAM.


In [8]:
# Check number of parameters.
print("Model has {:,} parameters.".format(model.num_parameters()))

Model has 1,016,841,728 parameters.
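The parameter count gives a quick sanity check on the memory figure measured earlier. The sketch below assumes the weights are stored as 32-bit floats (4 bytes each), which is an assumption about the loaded model and not something the notebook prints; `weights_size_gib` is a hypothetical helper name.

```python
def weights_size_gib(num_parameters, bytes_per_parameter=4):
    """Approximate size of the model weights in GiB."""
    return num_parameters * bytes_per_parameter / 1024 ** 3


num_parameters = 1_016_841_728  # value reported by model.num_parameters() above
print("float32 weights: {:.2f} GiB".format(weights_size_gib(num_parameters)))
print("float16 weights: {:.2f} GiB".format(weights_size_gib(num_parameters, 2)))
```

Under that assumption the weights alone account for roughly 3.79 GiB of the 4.82 measured with `nvidia-smi`. The remainder is plausibly the CUDA context and other buffers, though the notebook does not break it down.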


In [22]:
# Set the model to evaluation mode (disables dropout).
model = model.eval()

## Generation parameters

In [32]:
#@markdown Options for the `model.generate` method. See the documentation <a href="https://huggingface.co/transformers/main_classes/model.html#transformers.generation_utils.GenerationMixin.generate" target="_blank">here</a>.

max_length = 200  #@param {type: "slider", min: 100, max: 1024}
do_sample = True  #@param {type: "boolean"}
top_k = 50  #@param {type: "number"}
top_p = 0.95  #@param {type: "number"}
num_return_sequences = 1    #@param {type: "number"}
#@markdown ---
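To give some intuition for `top_k` and `top_p`, here is a simplified, pure-Python sketch of the two filters applied to a toy next-token distribution. It is an illustration only and not the implementation used by `model.generate`; the function name `top_k_top_p_filter` and the toy probabilities are hypothetical.

```python
def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Return {token_index: probability} after top-k, then top-p filtering."""
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Top-k: keep at most top_k tokens (a value of 0 disables the filter in this sketch).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Renormalize over the tokens that survived top-k.
    mass = sum(probs[i] for i in ranked)
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break
    # Renormalize again so the kept probabilities sum to 1 before sampling.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


toy_probs = [0.5, 0.3, 0.1, 0.06, 0.04]
print(top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8))
```

In this toy case, top-k keeps the three most probable tokens and top-p then narrows them to the first two. Sampling from the filtered distribution avoids very unlikely tokens while still allowing some variety.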


In [34]:
# Generate a sample of text.
# This should take a few seconds.
input_sentence = "Longtemps je me suis couché de bonne heure."
input_ids = tokenizer.encode(input_sentence, return_tensors='pt').to(device)

# With do_sample=True, tokens are sampled (top-k / top-p); no beam search is run.
sample_outputs = model.generate(
    input_ids,
    max_length=max_length,
    do_sample=do_sample,
    top_k=top_k,
    top_p=top_p,
    num_return_sequences=num_return_sequences
)

print("Output:\n" + 100 * '-')
tokenizer.decode(sample_outputs[0], skip_special_tokens=True)

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


Output:
----------------------------------------------------------------------------------------------------


'Longtemps je me suis couché de bonne heure. Une fois de temps en temps je me levais tout simplement. Puis je me couchais et l’aurore me frappait le front sans un bruit. Je me couchais sur le côté et j’allais, à grands pas, de l’un à l’autre bout du cimetière. Je n’avais pas de lampe, et, sous l’ombre des arbres qui, à présent, se drapent de leurs feuilles mortes, je voyais passer devant moi la procession des âmes que les rues et la ville avaient faites entrer dans l’oubli. De temps en temps, je m’asseyais sur un banc, dans une attitude de méditation silencieuse. Les gens de mon village étaient dans l’impossibilité de se dire combien ils avaient été bons pour moi, car ils avaient toujours eu l’habitude de m’entendre louer leur piété, comme ils l’avaient fait pour moi'
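Because generation stops once `max_length` tokens are reached, the sample above ends in the middle of a sentence. One optional post-processing step, not part of the original notebook, is to cut the text after its last sentence terminator. The helper name `trim_to_last_sentence` and the choice of terminator characters are assumptions made for this sketch.

```python
def trim_to_last_sentence(text, terminators=".!?…"):
    """Cut text after its last sentence terminator; return it unchanged if none is found."""
    last = max(text.rfind(ch) for ch in terminators)
    return text if last == -1 else text[: last + 1]


generated = "Longtemps je me suis couché de bonne heure. Une fois de temps en temps je me"
print(trim_to_last_sentence(generated))
```

This simple rule ignores closing quotation marks and abbreviations, so it is best treated as a starting point.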