One of the main difficulties of working with roman-Urdu is that many off-the-shelf NLP tools don't support it. In this repository, I show how to create word embeddings from raw roman-Urdu text.
This is only a demonstration of how it can be done; for better results, you should acquire more data (scraping is probably the way to go).
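As a rough illustration of the idea, the sketch below tokenizes raw roman-Urdu text and trains FastText embeddings on it. It is not the notebook's code. It assumes the `gensim` library (4.x API) is installed, that the dataset file holds one sentence per line, and that a simple lowercase alphanumeric tokenizer is good enough. The function names and hyperparameter values are hypothetical choices made for the example.

```python
import re


def tokenize(line):
    """Lowercase a line and split it into alphanumeric tokens.

    Assumption: punctuation can be dropped. This is a deliberately simple
    tokenizer, so apostrophes and other symbols are lost.
    """
    return re.findall(r"[a-z0-9]+", line.lower())


def load_sentences(path):
    """Read a text file (assumed: one sentence per line) into token lists."""
    sentences = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens = tokenize(line)
            if tokens:  # skip blank or punctuation-only lines
                sentences.append(tokens)
    return sentences


def train_embeddings(sentences, dim=100):
    """Train FastText vectors on tokenized sentences.

    gensim is imported here so the helpers above work without it.
    The hyperparameters are illustrative, not tuned values.
    """
    from gensim.models import FastText

    return FastText(
        sentences=sentences,
        vector_size=dim,  # embedding dimensionality
        window=5,         # context window size
        min_count=2,      # ignore words seen fewer than 2 times
        epochs=10,
    )
```

With the dataset downloaded locally, something like `model = train_embeddings(load_sentences("roman-urdu-dataset.txt"))` followed by `model.wv.most_similar("acha", topn=5)` would list the nearest neighbours of a query word ("acha" is only an example query). FastText builds vectors from character n-grams, so it can also produce a vector for a spelling variant that never appeared in the training text, which is useful because roman-Urdu has no standard spelling.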
Notebook: https://github.com/BahramKBaloch/RomanUrduEmbeddings/blob/main/FastText_Example_Roman_Urdu.ipynb
Dataset: https://github.com/BahramKBaloch/RomanUrduEmbeddings/blob/main/roman-urdu-dataset.txt
Repo: https://github.com/BahramKBaloch/RomanUrduEmbeddings
Special thanks to Hamza Khan for providing part of the data.
Better pre-processing to avoid data loss.