The base preprocessing program consists of tokenization, removal of stop words and punctuation, and lemmatization. The function was built with spaCy, a library for advanced Natural Language Processing in Python. Use the link to read more about spaCy usage. An alternative version was built with the nltk package.
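
The steps above can be sketched roughly as follows. This is a minimal illustration, not the original function: the name `preprocess` is hypothetical, and the code assumes its input is a spaCy `Doc` (or any iterable of objects exposing the spaCy token attributes `is_stop`, `is_punct`, `is_space` and `lemma_`). Tokenization is assumed to have been done already by the spaCy pipeline that produced the `Doc`.

```python
from typing import Iterable, List


def preprocess(doc: Iterable) -> List[str]:
    """Return the lemmas of tokens that are not stop words, punctuation, or whitespace.

    Assumption: `doc` is a spaCy Doc, i.e. the text has already been
    tokenized and lemmatized by a spaCy pipeline.
    """
    return [
        token.lemma_
        for token in doc
        # Drop stop words, punctuation, and whitespace-only tokens.
        if not (token.is_stop or token.is_punct or token.is_space)
    ]
```

With spaCy installed and an English model such as `en_core_web_sm` downloaded, this could be called as `preprocess(spacy.load("en_core_web_sm")("Some text."))`. Taking a `Doc` rather than a raw string keeps model loading outside the function, so the same pipeline object can be reused across many texts.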