Commit

add trained_folder note
minimaxir committed Apr 18, 2021
1 parent e2f5d5d commit 7c7888a
Showing 2 changed files with 4 additions and 5 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -82,7 +82,7 @@ ai = aitextgen(tokenizer_file=tokenizer_file, config=config)
# which automatically processes the dataset with the appropriate size.
data = TokenDataset(file_name, tokenizer_file=tokenizer_file, block_size=64)

- # Train the model! It will save pytorch_model.bin periodically and after completion.
+ # Train the model! It will save pytorch_model.bin periodically and after completion to the `trained_model` folder.
# On a 2020 8-core iMac, this took ~25 minutes to run.
ai.train(data, batch_size=8, num_steps=50000, generate_every=5000, save_every=5000)

7 changes: 3 additions & 4 deletions notebooks/training_hello_world.ipynb
@@ -131,7 +131,7 @@
},
{
"source": [
- "Train the model! It will save pytorch_model.bin periodically and after completion. On a 2020 8-core iMac, this took ~25 minutes to run.\n",
+ "Train the model! It will save pytorch_model.bin periodically and after completion to the `trained_model` folder. On a 2020 8-core iMac, this took ~25 minutes to run.\n",
"\n",
"The configuration below processes 400,000 subsets of tokens (8 * 50000), which is about just one pass through all the data (1 epoch). Ideally you'll want multiple passes through the data and a training loss less than `2.0` for coherent output; when training a model from scratch, that's more difficult, but with long enough training you can get there!"
],
@@ -317,9 +317,8 @@
"metadata": {},
"outputs": [],
"source": [
- "ai2 = aitextgen(model=\"trained_model/pytorch_model.bin\",\n",
- " tokenizer_file=\"aitextgen.tokenizer.json\",\n",
- " config=\"trained_model/config.json\")"
+ "ai2 = aitextgen(model_folder=\"trained_model\",\n",
+ " tokenizer_file=\"aitextgen.tokenizer.json\")"
]
},
{
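
Note on the notebook hunk: loading by folder presumably relies on `trained_model` containing both files that the removed call named explicitly, `pytorch_model.bin` and `config.json`. The sketch below is a hypothetical helper (it is not part of aitextgen) that uses only the standard library to check for those files before reloading; the file names are taken from the removed lines, and everything else is an assumption.

```python
from pathlib import Path

# File names taken from the removed lines of the notebook hunk;
# this helper itself is hypothetical and not part of aitextgen.
EXPECTED_FILES = ("pytorch_model.bin", "config.json")


def missing_model_files(folder="trained_model"):
    """Return the expected file names that are absent from `folder`."""
    base = Path(folder)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


missing = missing_model_files("trained_model")
if missing:
    print("Not ready to reload, missing:", ", ".join(missing))
else:
    print("trained_model looks complete")
    # The reload call from the diff would go here, e.g.:
    # ai2 = aitextgen(model_folder="trained_model",
    #                 tokenizer_file="aitextgen.tokenizer.json")
```

If the helper returns an empty list, the `model_folder="trained_model"` call shown in the diff should have the files it needs; a non-empty list suggests training has not yet reached its first save.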
