Commit

add telugu support
Shubhamjain27 committed Oct 11, 2020
1 parent a6b7051 commit 035d447
Showing 5 changed files with 16 additions and 1 deletion.
Binary file added .DS_Store
Binary file not shown.
2 changes: 2 additions & 0 deletions README.md
@@ -31,6 +31,7 @@ Checkout detailed docs along with Installation instructions
| Nepali | ne |
| Sanskrit | sa |
| English | en |
| Telugu | te |

#### Code Mixed languages

@@ -56,6 +57,7 @@ Checkout detailed docs along with Installation instructions
| Sanskrit | [NLP for Sanskrit](https://github.com/goru001/nlp-for-sanskrit) | [Sanskrit Wikipedia Articles](https://www.kaggle.com/disisbig/sanskrit-wikipedia-articles) | ~6 | ~3 | [Sanskrit Shlokas Dataset](https://www.kaggle.com/disisbig/sanskrit-shlokas-dataset) | 84.3 (valid set) | | | [Sanskrit Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-sanskrit/master/language-model/embedding_projector_config.json) | [Sanskrit Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-sanskrit/master/language-model/embedding_projector_transformer_config.json) |
| Nepali | [NLP for Nepali](https://github.com/goru001/nlp-for-nepali) | [Nepali Wikipedia Articles](https://www.kaggle.com/disisbig/nepali-wikipedia-articles) | 31.5 | 29.3 | [Nepali News Dataset](https://www.kaggle.com/disisbig/nepali-news-dataset) | 98.5 (valid set) | | | [Nepali Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-nepali/master/language-model/embedding_projector_config.json) | [Nepali Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-nepali/master/language-model/embedding_projector_transformer_config.json) |
| Urdu | [NLP for Urdu](https://github.com/anuragshas/nlp-for-urdu) | [Urdu Wikipedia Articles](https://www.kaggle.com/disisbig/urdu-wikipedia-articles) | 13.19 | 12.55 | [Urdu News Dataset](https://www.kaggle.com/disisbig/urdu-news-dataset) | 95.28 (valid set) | | | [Urdu Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/anuragshas/nlp-for-urdu/master/language-model/embedding_projector_config.json) | [Urdu Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/anuragshas/nlp-for-urdu/master/language-model/embedding_projector_transformer_config.json) |
| Telugu | [NLP for Telugu](https://github.com/Shubhamjain27/nlp-for-telugu) | [Telugu Wikipedia Articles](https://www.kaggle.com/shubhamjain27/telugu-wikipedia-articles) | 27.47 | 29.44 | [Telugu News Dataset](https://www.kaggle.com/shubhamjain27/telugu-news-articles)<br><br><br>[Telugu News Andhra Jyoti](https://www.kaggle.com/shubhamjain27/telugu-newspaperdata) | 95.4<br><br><br>92.09 | | [Notebook](https://github.com/Shubhamjain27/nlp-for-telugu/tree/master/classification/Telugu_Classification_Model.ipynb) <br><br><br>[Notebook](https://github.com/Shubhamjain27/nlp-for-telugu/tree/master/classification/Telugu_news_classification_Andhra_Jyoti.ipynb) | [Telugu Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/Shubhamjain27/nlp-for-telugu/master/language-model/embedding_projector_config.json) | [Telugu Embeddings projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/Shubhamjain27/nlp-for-telugu/master/language-model/embedding_projector_transformer_config.json) |
| Tanglish | [NLP for Tanglish](https://github.com/goru001/nlp-for-tanglish) | [Synthetic Tanglish Dataset](https://drive.google.com/drive/folders/1M4Sx_clF0iP1y-JG3OhfacFKTDoHXCR1?usp=sharing) | 37.50 | - | Dravidian Codemix HASOC @ FIRE 2020<br><br>Dravidian Codemix Sentiment Analysis @ FIRE 2020 | F1 Score: 0.88<br><br>F1 Score: 0.62 | - | [Notebook](https://github.com/goru001/nlp-for-tanglish/blob/master/classification/classification_model_hasoc.ipynb)<br><br>[Notebook](https://github.com/goru001/nlp-for-tanglish/blob/master/classification/classification_model_dc_fire.ipynb) | [Tanglish Embeddings Projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-tanglish/master/language-model/embedding_projector_config.json) | - |
| Manglish | [NLP for Manglish](https://github.com/goru001/nlp-for-manglish) | [Synthetic Manglish Dataset](https://drive.google.com/drive/folders/1M4Sx_clF0iP1y-JG3OhfacFKTDoHXCR1?usp=sharing) | 45.84 | - | Dravidian Codemix HASOC @ FIRE 2020<br><br>Dravidian Codemix Sentiment Analysis @ FIRE 2020 | F1 Score: 0.74<br><br>F1 Score: 0.69 | - | [Notebook](https://github.com/goru001/nlp-for-manglish/blob/master/classification/classification_model_hasoc.ipynb)<br><br>[Notebook](https://github.com/goru001/nlp-for-manglish/blob/master/classification/classification_model_dc_fire.ipynb) | [Manglish Embeddings Projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-manglish/master/language-model/embedding_projector_config_latin_script.json) | - |
| Hinglish | [NLP for Hinglish](https://github.com/goru001/nlp-for-hinglish) | [Synthetic Hinglish Dataset](https://www.dropbox.com/sh/as5fg8jsrljt6k7/AADnSLlSNJPeAndFycJGurOUa?dl=0) | 86.48 | - | - | - | - | - | [Hinglish Embeddings Projection](https://projector.tensorflow.org/?config=https://raw.githubusercontent.com/goru001/nlp-for-hinglish/main/language_model/embedding_projector_config.json) | - |