Commit

docs: Added link to Common Crawl's terms of use
pjox committed Jun 17, 2024
1 parent 05ce7fd commit a853c2a
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions docs/versions/mOSCAR.md
@@ -27,8 +27,8 @@ Paper link: [https://arxiv.org/abs/2406.08707](https://arxiv.org/abs/2406.08707)

## Language table

-| Lang. name | Code | Family | Script | #documents | #images | # tokens |
-| ---------------------- | -------- | ------------- | ---------- | ----------- | ----------- | -------------- |
+| Lang. name | Code | Family | Script | #documents | #images | # tokens |
+| ---------------------- | -------- | -------------- | ---------- | ---------- | ----------- | -------------- |
| Acehnese | ace_Latn | Austronesian | Latin | 7,803 | 32,461 | 2,889,134 |
| Mesopotamian Arabic | acm_Arab | Afro-Asiatic | Arabic | 2,274 | 10,620 | 1,047,748 |
| Tunisian Arabic | aeb_Arab | Afro-Asiatic | Arabic | 7,640 | 41,570 | 2,715,187 |
@@ -202,6 +202,8 @@ These data are released under this licensing scheme:
- To the extent possible under law, Inria has waived all copyright and related or neighboring rights to OSCAR.
- This work is published from: France.

+Please also refer to Common Crawl's [Terms of Use](https://commoncrawl.org/terms-of-use)
+
## Citation
```
@article{futeral2024moscar,
