This is my final project for the Text Mining course at Harbour.Space University, 2022.
Only a few specific categories of algorithms are covered.
- Libraries used: spacy, nltk, numpy, sklearn, streamlit
- Uses a TF-IDF vectorizer to vectorize documents.
- Uses k-d trees to index the vectors and execute queries faster.
- Uses Levenshtein Distance and Longest Common Prefix to find the closest words to the words given in the input.
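The word-matching step in the last bullet can be illustrated with a minimal sketch in plain Python. This is not the code from this repo: the function names (`levenshtein`, `common_prefix_length`, `closest_words`) are hypothetical, and the way the two measures are combined (smaller edit distance first, then longer common prefix) is an assumption made for illustration.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def closest_words(query: str, vocabulary: List[str], k: int = 3) -> List[str]:
    """Return the k vocabulary words closest to query.

    Assumed ranking: smaller edit distance first, then longer common prefix,
    then alphabetical order to keep the result deterministic.
    """
    def rank(word: str):
        return (levenshtein(query, word), -common_prefix_length(query, word), word)

    return sorted(vocabulary, key=rank)[:k]


if __name__ == "__main__":
    # A misspelled query is mapped to the nearest known word.
    print(closest_words("dijsktra", ["dijkstra", "kruskal", "prim"], k=1))
```

A scan over the whole vocabulary like this is linear in the number of words, which is usually fine for a small vocabulary; a larger one might need an index.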
- Clone this repo (for help see this tutorial).
- Unzip the file `wiki_data.zip`.
- Make sure to install streamlit in your environment.
- Run `python -m spacy download en_core_web_sm` from your terminal.
- Run `streamlit run app.py` to start the app. It should open a tab in your browser automatically, where you can search for different algorithms and receive Wikipedia links.
- Email: anier.velasco@gmail.com
- Telegram: https://t.me/aniervs
- Github: https://github.com/aniervs
- LinkedIn: https://www.linkedin.com/in/aniervs
