- I am Kate.
- I am currently studying at the Higher School of Economics.
- Major: Fundamental and Applied Linguistics.
- Minor: Intellectual Data Analysis.
- Interested in Data Science, Machine Learning and Linguistics.
- Projects on data analysis and the application of various machine learning algorithms: supervised learning, regression, clustering (including my own implementation of the KMeans algorithm), exploratory data analysis, my own Perceptron implementation, and notebooks from Kaggle. The EPAM Data Science course and the EPAM Advanced NLP course are also available here.
- A coursework project in collaboration with Julia on the topic "Embedding in the Study of Idioms". The data collection stage is now complete. The repository contains the .dsl files of five dictionaries, the parsers that convert them to JSON, and the resulting JSON files. Some analysis of the data is also included.
- Concordance maker makes concordances for a given word based on a frequency dictionary.
- Language identifier identifies the language of a given text (English or German).
- Plagiarism estimator compares two texts and reports a plagiarism percentage based on their longest common subsequences.
- Text generator generates text based on another given English text (using maximum likelihood estimation or a back-off model).
- Simple phonebook manager, where I learned how to work with simple SQL queries.
- Study project for exploring basic algorithms.
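
To illustrate the idea behind the plagiarism estimator listed above, here is a minimal sketch. It assumes the texts are split into lowercase whitespace tokens and that the score is the length of the longest common subsequence divided by the length of the checked text. The names `lcs_length` and `plagiarism_percent` are hypothetical and are not taken from the repository, which may tokenize and score differently.

```python
def lcs_length(a, b):
    """Return the length of the longest common subsequence of two sequences."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def plagiarism_percent(original, suspicious):
    """Share of the suspicious text's tokens that lie on an LCS with the original.

    Assumption: simple lowercase whitespace tokenization and normalization by
    the length of the suspicious text.
    """
    orig_tokens = original.lower().split()
    susp_tokens = suspicious.lower().split()
    if not susp_tokens:
        return 0.0
    return 100.0 * lcs_length(orig_tokens, susp_tokens) / len(susp_tokens)


if __name__ == "__main__":
    score = plagiarism_percent("the cat sat on the mat", "the cat lay on the mat")
    print(f"{score:.1f}%")
```

Only two rows of the dynamic programming table are kept at a time, so memory grows with the length of the second text rather than with the product of both lengths.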