Skip to content

Latest commit

 

History

History
13 lines (9 loc) · 472 Bytes

README.md

File metadata and controls

13 lines (9 loc) · 472 Bytes

Hindi-NER

This is an example project to let you use your dataset and publish it on hugginface.

The dataset used in this project is IJNLP 2008 Hindi dataset. I have converted to word tag format, separated by tab.

Now split dataset into train, test and validation. You can choose whatever percentange you want for your dataset.

python split_hindi.py

Now you can publish your dataset by adding token and your account info in publish_hindi.py script.