This repo contains the development of Steeve's NLP model. The complete product is in the Steeve_bot repo.
To improve performance, we train a classifier that predicts the field best suited to a user, and we use an SVM as the model. First, a filter retrieves the programming languages mentioned in each post. These extracted programming languages serve as the features, and the label is the field the post comes from.
- Retrieve programming languages (PLs) from the posts.
- Convert these PLs into 300-dimensional vectors.
- Sum the PL vectors from the same post to form its feature vector; the label is the field the post comes from.
- Train the SVM on these features and labels.
- Retrieve PLs from the user's input.
- Convert these PLs into 300-dimensional vectors.
- Sum these vectors.
- Feed the summed vector into the SVM model to predict the field.
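
The training and prediction steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the code in this repo: the keyword list `KNOWN_PLS`, the helpers `extract_pls` and `featurize`, the toy posts, and the field names are all hypothetical, fixed random vectors stand in for the real 300-dimensional PL vectors (whose source is not described here), and scikit-learn's `SVC` is assumed as the SVM implementation.

```python
import re

import numpy as np
from sklearn.svm import SVC

DIM = 300  # vector size named in the steps above

# Hypothetical keyword list; the real filter and its keywords are not shown here.
KNOWN_PLS = ["python", "javascript", "html", "css", "sql", "r"]

# Placeholder embeddings: fixed random vectors stand in for the real
# 300-dimensional PL vectors, whose source is not specified.
_rng = np.random.default_rng(0)
PL_VECTORS = {pl: _rng.normal(size=DIM) for pl in KNOWN_PLS}


def extract_pls(text):
    """Return the known PLs mentioned in the text (simple token match)."""
    tokens = re.findall(r"[a-z+#]+", text.lower())
    return [tok for tok in tokens if tok in PL_VECTORS]


def featurize(text):
    """Sum the vectors of all PLs found in the text."""
    vec = np.zeros(DIM)
    for pl in extract_pls(text):
        vec += PL_VECTORS[pl]
    return vec


# Toy training posts, each labeled with the field it came from (made up).
posts = [
    ("Building pages with HTML, CSS and JavaScript", "frontend"),
    ("Styling a layout using CSS and HTML", "frontend"),
    ("Cleaning tables with SQL and Python", "data"),
    ("Plotting survey results in R and Python", "data"),
]

# Training: one summed feature vector per post, labeled with its field.
X = np.array([featurize(text) for text, _ in posts])
y = [field for _, field in posts]
model = SVC(kernel="linear")
model.fit(X, y)

# Prediction: the user's input goes through the same feature extraction.
user_input = "I mostly write HTML and CSS"
predicted = model.predict([featurize(user_input)])[0]
print(predicted)
```

Summing the vectors gives every post and every user input a fixed-length feature regardless of how many PLs it mentions, which is what lets the same SVM handle both.
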
- Rule.txt - contains the keywords.
- candidates_of_keyword.py - the core of the NLP model, in charge of connecting with the server/database.
- modules.py - contains reusable utility functions.
- SVM.py and TFIDF.py - singleton classes; only one instance of each exists in the whole system.
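
For readers unfamiliar with singleton usage, here is one common way to get a single shared instance in Python. It is only a sketch: the metaclass approach and the names `SingletonMeta` and `SVMModel` are assumptions for illustration, and SVM.py and TFIDF.py may implement the pattern differently.

```python
class SingletonMeta(type):
    """Metaclass that creates each class's instance once and then reuses it."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Build the instance on the first call; return the cached one afterwards.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class SVMModel(metaclass=SingletonMeta):  # hypothetical class name
    def __init__(self):
        # Stand-in for expensive setup such as loading a trained model.
        self.loaded = True


a = SVMModel()
b = SVMModel()
print(a is b)
```

Sharing one instance means costly setup such as loading a model runs only once. This sketch is not thread-safe; concurrent first calls would need a lock.
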
- Chia Fang Ho - KellyHO
- Wen Bin Han - HanVincent