searayeah/avito-category-competition

avito-category-competition

Text classification competition: Avito Category Prediction

Features:

  • Product name (Russian text)
  • Description (Russian text)

Target:

  • Category (50 classes)

Text preprocessing:

  • remove punctuation and extra symbols
  • lowercase
  • lemmatize using pymystem3
  • remove stopwords using NLTK
  • remove words shorter than 3 characters
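
The steps above could be sketched roughly as follows. This is an illustration, not the code used in the competition: the function names `preprocess` and `build_russian_preprocessor` and the punctuation regex are assumptions, and so is the exact definition of "extra symbols". The lemmatizer and stopword list are passed in as arguments so the core function runs without external downloads; `build_russian_preprocessor` shows one way to plug in pymystem3 and the NLTK Russian stopword list (it needs both packages installed and `nltk.download("stopwords")` run beforehand).

```python
import re
from typing import Callable, Collection, List

# Assumption: "punctuation and extra symbols" means anything that is not a
# letter, digit, or whitespace. Underscores are dropped as well.
_NON_WORD = re.compile(r"[^\w\s]|_")


def preprocess(
    text: str,
    lemmatize: Callable[[str], List[str]],
    stopwords: Collection[str],
    min_len: int = 3,
) -> str:
    """Clean one text: strip symbols, lowercase, lemmatize, filter tokens."""
    text = _NON_WORD.sub(" ", text).lower()
    tokens = (tok.strip() for tok in lemmatize(text))
    # Drop stopwords and tokens shorter than min_len (this also removes the
    # whitespace-only tokens that some lemmatizers return).
    kept = [t for t in tokens if len(t) >= min_len and t not in stopwords]
    return " ".join(kept)


def build_russian_preprocessor() -> Callable[[str], str]:
    """Hypothetical helper wiring in pymystem3 and NLTK (imported lazily)."""
    from nltk.corpus import stopwords as nltk_stopwords
    from pymystem3 import Mystem

    mystem = Mystem()
    stops = frozenset(nltk_stopwords.words("russian"))
    return lambda text: preprocess(text, mystem.lemmatize, stops)


# Usage with stand-ins: str.split plays the role of the lemmatizer here.
example = preprocess("Продам ДИВАН, б/у!!!", str.split, {"продам"})
```

With the stand-in lemmatizer, `example` keeps only the word "диван": the stopword and the one-letter tokens are filtered out.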

Text embedding: TfidfVectorizer(ngram_range=(1, 2))

Model: SGDClassifier(n_jobs=-1, alpha=0.0000002, tol=1e-4)
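
A minimal sketch of how the vectorizer and classifier above might be combined with scikit-learn. The use of `make_pipeline`, the helper name `build_model`, and the toy texts and labels are assumptions for illustration; the README does not say how the name and description fields were combined, so each sample here is simply one already-preprocessed string.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline


def build_model():
    """TF-IDF over unigrams and bigrams, followed by a linear SGD classifier."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        # Parameters as listed in the README; the loss is left at the
        # scikit-learn default ("hinge", i.e. a linear SVM).
        SGDClassifier(n_jobs=-1, alpha=0.0000002, tol=1e-4),
    )


# Toy stand-in data (hypothetical, not from the competition dataset).
texts = [
    "диван угловой кожаный",
    "шкаф купе зеркало",
    "телефон смартфон экран",
    "ноутбук игровой процессор",
]
labels = ["мебель", "мебель", "электроника", "электроника"]

model = build_model().fit(texts, labels)
predicted = model.predict(["диван кожаный"])
```

Wrapping both steps in one pipeline keeps the TF-IDF vocabulary fitted only on the training data, which matters once cross-validation is involved.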

Hyperparameters were tuned with grid search.

Result: second place, with accuracy = 0.91686
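
One way such a search could look with scikit-learn's `GridSearchCV`. The parameter grid, the number of folds, and the toy data below are all hypothetical; the README does not state which values were searched.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline

pipeline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    SGDClassifier(n_jobs=-1),
)

# Hypothetical grid; keys follow the "<step name>__<parameter>" convention.
param_grid = {
    "sgdclassifier__alpha": [2e-7, 2e-6, 2e-5],
    "sgdclassifier__tol": [1e-4, 1e-3],
}

# Toy stand-in data (not from the competition dataset).
texts = [
    "диван угловой кожаный",
    "шкаф купе зеркало",
    "стол кухонный деревянный",
    "кресло офисный мягкий",
    "телефон смартфон экран",
    "ноутбук игровой процессор",
    "планшет экран зарядка",
    "наушники беспроводной bluetooth",
]
labels = ["мебель"] * 4 + ["электроника"] * 4

# cv=2 only because the toy set is tiny; a real run would use more folds.
search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=2)
search.fit(texts, labels)
best_params = search.best_params_
```

After fitting, `search.best_params_` holds the best combination found and `search.best_estimator_` is the pipeline refitted on all the data with those values.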
