Commit

Merge pull request #25 from bigcode-project/loubnabnl-patch-2
Update README.md
loubnabnl committed Nov 28, 2022
2 parents 948e4f5 + ed5432c commit 8a678a1
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion README.md
@@ -7,4 +7,7 @@ necessary used for model training.

 - `language_selection`: notebooks and file with language to file extensions mapping used to build the Stack v1.1.
 - `pii`: code for running PII detection and anonymization on code datasets.
-- `preprocessing`: code for filtering code datasets based on line length and percentage of alphanumeric characters.
+- `preprocessing`: code for filtering code datasets based on:
+  - line length and percentage of alphanumeric characters.
+  - number of stars.
+  - comments to code ratio.
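
For readers unfamiliar with these filters, the three criteria the updated README lists can be sketched roughly as below. This is a minimal illustration and not the repository's implementation: the field names `content` and `stars`, the `Thresholds` class, every threshold value, and the `#`-prefix comment heuristic are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values chosen for illustration only.
    max_line_length: int = 1000
    max_mean_line_length: float = 100.0
    min_alnum_fraction: float = 0.25
    min_stars: int = 0
    min_comment_ratio: float = 0.01
    max_comment_ratio: float = 0.8


def line_stats(text: str) -> tuple[int, float]:
    """Return (longest line length, mean line length) for a file's text."""
    lengths = [len(line) for line in text.splitlines()] or [0]
    return max(lengths), sum(lengths) / len(lengths)


def alnum_fraction(text: str) -> float:
    """Fraction of characters in the text that are alphanumeric."""
    if not text:
        return 0.0
    return sum(ch.isalnum() for ch in text) / len(text)


def comment_ratio(text: str, prefix: str = "#") -> float:
    """Share of characters on lines starting with `prefix` (simplified heuristic)."""
    if not text:
        return 0.0
    comment_chars = sum(
        len(line) for line in text.splitlines() if line.lstrip().startswith(prefix)
    )
    return comment_chars / len(text)


def keep(example: dict, t: Thresholds = Thresholds()) -> bool:
    """Return True if the example passes all three filters."""
    text = example["content"]  # assumed field name
    longest, mean = line_stats(text)
    return (
        longest <= t.max_line_length
        and mean <= t.max_mean_line_length
        and alnum_fraction(text) >= t.min_alnum_fraction
        and example.get("stars", 0) >= t.min_stars  # assumed field name
        and t.min_comment_ratio <= comment_ratio(text) <= t.max_comment_ratio
    )
```

A predicate of this shape returns one boolean per example, so it could be passed to a row-wise filter such as `Dataset.filter` in the Hugging Face `datasets` library, provided the dataset's column names match the ones assumed here.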
