This repository is set up specifically for students of the Computer Engineering Department at the University of Guilan who are studying NLP.

Natural Language Processing 981

Course Information

  • Name: Natural Language Processing
  • Code: 981
  • Schedule: Fall 2019, Wednesday 3:00pm – 6:15pm
  • Location: 15, Engineering Faculty, University of Guilan
  • Credits: 3.0
  • Lecturer: Javad PourMostafa
  • TA: Parsa Abbasi
  • NB: Drop me a line to get the slides!

Theoretical Sessions

  • Week 1: Introduction to NLP

    Main Approaches (Rule-based, Probabilistic Models, Traditional ML algorithms and Neural Networks), Confusion Matrix, Semantic Slot Filling, NLP Pyramid, Scenario of Text Classification, Gradient Descent, Tokenization, Normalization, Stemmer, Lemmatizer, BoW, N-Grams, TF-IDF, Binary Logistic Regression, Hashing Features for word representation, Neural Vectorization, 1-D Convolutional Layer

  • Week 2: Language Modeling

    Chain Rule, Probability of a Sequence of Words, Markov Assumption, Bi-grams, Maximum Likelihood Estimation (MLE), Generative Models, Evaluating Language Models (Extrinsic, Intrinsic), Perplexity, Smoothing (Discounting), Laplace Smoothing (Add-one, Add-k), Stupid/Katz Backoff, Kneser-Ney Smoothing (code sketch below)

  • Week 3: Hidden Markov Model for Sequence Labeling (POS)

    Sequence Labeling, Markov Model Scenario, Markov Chain Model, Emission and Transition Probabilities, HMM for POS Tagging, Text Generation with HMMs, Training HMMs, Viterbi Algorithm, Using Dynamic Programming for the Viterbi backtrace (code sketch below)

  • Week 4: Neural Language Models

    Curse of Dimensionality, Distributed Representation, Neuron, Activation Functions, The Perceptron, The XOR problem, Feed-Forward Neural Networks, Training Neural Networks, Loss Function, Cross-Entropy Loss, Dropout, A Neural Probabilistic Language Model, Recurrent Neural Language Models, Gated Recurrent Neural Networks, LSTM

  • Week 5: Vector Semantics And Embeddings

    Word Similarities, Embeddings, Term-Document Matrix, Document-Term Matrix, Term-Context Matrix, Visualizing Document Vectors, Word Windows, Reminders from Linear Algebra, Computing Cosine Similarity, Pointwise Mutual Information (PMI), Sources of Dense Embeddings, Word2vec, Skip-Gram Algorithm, Skip-Gram with Negative Sampling (code sketch below)

  • Week 6: Topic Modeling

    Probabilistic Latent Semantic Analysis (PLSA), the Problem of Probability Density Estimation, MLE, Expectation-Maximization (EM) Algorithm, Using MLE in PLSA, Using EM in PLSA, E-Step in PLSA, M-Step in PLSA (code sketch below)
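
The Week 2 material on n-gram language modeling condenses naturally into a few lines of Python. Below is a minimal sketch, not course code: the toy corpus, the bigram_prob and perplexity helpers, and the use of add-one smoothing are all assumptions made for illustration.

```python
# Minimal bigram language model with add-one (Laplace) smoothing.
# The corpus and sentence markers are toy assumptions for illustration.
import math
from collections import Counter

corpus = [["<s>", "i", "like", "nlp", "</s>"],
          ["<s>", "i", "like", "deep", "learning", "</s>"]]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter((sent[i], sent[i + 1])
                  for sent in corpus for i in range(len(sent) - 1))
V = len(unigrams)  # vocabulary size used by add-one smoothing

def bigram_prob(prev, word):
    """P(word | prev) = (count(prev, word) + 1) / (count(prev) + V)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

def perplexity(sentence):
    """exp of the average negative log-probability of the sentence's bigrams."""
    log_prob = sum(math.log(bigram_prob(sentence[i], sentence[i + 1]))
                   for i in range(len(sentence) - 1))
    return math.exp(-log_prob / (len(sentence) - 1))

print(bigram_prob("i", "like"))                       # smoothed MLE estimate
print(perplexity(["<s>", "i", "like", "nlp", "</s>"]))
```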
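
In the same spirit, the Week 3 Viterbi algorithm can be sketched with a two-tag toy HMM. Every probability below is an invented number chosen only so the example runs; unknown words are not handled.

```python
# Toy Viterbi decoder for HMM POS tagging (two tags, hand-picked probabilities).
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.6, "VERB": 0.4}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.6, "VERB": 0.4}}
emit_p = {"NOUN": {"fish": 0.5, "sleep": 0.2, "dogs": 0.3},
          "VERB": {"fish": 0.3, "sleep": 0.6, "dogs": 0.1}}

def viterbi(observations):
    # V[t][s] = probability of the best tag sequence ending in tag s at time t
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    backpointer = [{}]
    for t in range(1, len(observations)):
        V.append({})
        backpointer.append({})
        for s in states:
            best_prev = max(states, key=lambda p: V[t - 1][p] * trans_p[p][s])
            V[t][s] = V[t - 1][best_prev] * trans_p[best_prev][s] * emit_p[s][observations[t]]
            backpointer[t][s] = best_prev
    # backtrace: start from the best final tag and follow the backpointers
    tags = [max(states, key=lambda s: V[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        tags.append(backpointer[t][tags[-1]])
    return list(reversed(tags))

print(viterbi(["dogs", "sleep"]))  # ['NOUN', 'VERB'] with these toy numbers
```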
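
For Week 5, here is a small sketch of cosine similarity and positive PMI (PPMI) over a term-context count matrix. The words, contexts, and counts are invented for the example.

```python
# Cosine similarity and PPMI from a tiny term-context count matrix.
import numpy as np

words = ["cherry", "strawberry", "digital"]          # rows (target words)
contexts = ["pie", "sugar", "computer", "data"]      # columns (context words)
counts = np.array([[4, 3, 0, 1],
                   [3, 4, 0, 0],
                   [0, 0, 5, 4]], dtype=float)

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(words[0], words[1], round(cosine(counts[0], counts[1]), 2))  # high: shared contexts
print(words[0], words[2], round(cosine(counts[0], counts[2]), 2))  # low: different contexts

# PPMI(w, c) = max(0, log2(P(w, c) / (P(w) * P(c))))
p_wc = counts / counts.sum()
p_w = p_wc.sum(axis=1, keepdims=True)
p_c = p_wc.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore"):                   # log2(0) -> -inf, clipped to 0 below
    ppmi = np.maximum(0, np.log2(p_wc / (p_w * p_c)))
print(ppmi.round(2))
```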
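
Finally, the Week 6 E-step/M-step cycle of EM for PLSA fits in a short loop. The document-term counts, the two-topic choice, and the iteration count below are arbitrary assumptions for the sketch.

```python
# EM for PLSA on a toy document-term count matrix.
import numpy as np

rng = np.random.default_rng(0)
counts = np.array([[3, 2, 0, 0],     # rows: documents, columns: words
                   [2, 3, 0, 1],
                   [0, 0, 4, 3],
                   [0, 1, 3, 4]], dtype=float)
D, W = counts.shape
K = 2                                # number of latent topics (arbitrary)

# random initialisation of P(w|z) and P(z|d), each distribution normalised
p_w_given_z = rng.random((K, W))
p_w_given_z /= p_w_given_z.sum(axis=1, keepdims=True)
p_z_given_d = rng.random((D, K))
p_z_given_d /= p_z_given_d.sum(axis=1, keepdims=True)

for _ in range(100):
    # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
    joint = p_z_given_d[:, :, None] * p_w_given_z[None, :, :]   # shape (D, K, W)
    p_z_given_dw = joint / joint.sum(axis=1, keepdims=True)
    # M-step: re-estimate P(w|z) and P(z|d) from expected counts n(d,w) * P(z|d,w)
    expected = counts[:, None, :] * p_z_given_dw
    p_w_given_z = expected.sum(axis=0)
    p_w_given_z /= p_w_given_z.sum(axis=1, keepdims=True)
    p_z_given_d = expected.sum(axis=2)
    p_z_given_d /= p_z_given_d.sum(axis=1, keepdims=True)

print(p_z_given_d.round(2))   # topic mixture per document
print(p_w_given_z.round(2))   # word distribution per topic
```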

Practical Sessions

  • Week 1: Supervised Classification

    Confusion Matrix, Box-and-Whisker Plot, Using supervised models such as Logistic Regression, Decision Trees, and so on (code sketch below).

  • Week 2: Introduction to Neural Networks

    Feedforward NN, Using MSE as a loss function, Updating weights via backpropagation, Gradient Descent Algorithm (code sketch below).

  • Week 3: Vector Semantics and Embeddings

    Bag of Words, Finding Unique Words, Creating a Document-Word Matrix, Computing TF-IDF, and finally comparing our naive TF-IDF model with scikit-learn's TfidfVectorizer (code sketch below)

  • Week 4: Word2Vec (SkipGram Algorithm)

    Finding word similarities, Setting hyperparameters, Generating training data, Fitting an unsupervised model, Inference on test samples (code sketch below)
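
To make the practical sessions more concrete, the sketches below mirror each week's steps on toy data. For Week 1: bag-of-words features, Logistic Regression, and a confusion matrix with scikit-learn. The eight labelled sentences are invented placeholders, not course data.

```python
# Supervised text classification: bag-of-words + Logistic Regression + confusion matrix.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

texts = ["great movie", "awful movie", "loved it", "hated it",
         "really great film", "really awful film", "loved this film", "hated this movie"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]          # 1 = positive, 0 = negative

X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels)

vectorizer = CountVectorizer()              # bag-of-words features
X_train = vectorizer.fit_transform(X_train_txt)
X_test = vectorizer.transform(X_test_txt)

clf = LogisticRegression().fit(X_train, y_train)
print(confusion_matrix(y_test, clf.predict(X_test)))
```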
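
For Week 2, a tiny NumPy feed-forward network trained on the XOR problem with an MSE loss, manual backpropagation, and plain gradient descent. The hidden-layer size, learning rate, and iteration count are arbitrary choices for the sketch.

```python
# Feed-forward network on XOR: forward pass, MSE loss, backprop, gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)     # hidden layer: 2 -> 8 units
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)     # output layer: 8 -> 1 unit
lr = 1.0                                          # learning rate (arbitrary)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    loss = np.mean((out - y) ** 2)                # MSE loss
    # backward pass (chain rule through MSE and the sigmoids)
    d_out = 2 * (out - y) / y.size * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent updates
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

# loss should shrink; predictions should move toward [0, 1, 1, 0] (exact values vary by run)
print(loss, out.round(2).ravel())
```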
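
For Week 3, a naive TF-IDF computation next to scikit-learn's TfidfVectorizer on a made-up three-document corpus. The two outputs will differ somewhat, because TfidfVectorizer smooths the idf term and L2-normalizes each document vector by default, while the naive version uses raw tf * log(N / df).

```python
# Naive TF-IDF versus scikit-learn's TfidfVectorizer on a toy corpus.
import math
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log", "cats and dogs"]
tokenized = [d.split() for d in docs]
N = len(docs)
df = Counter(w for doc in tokenized for w in set(doc))   # document frequency

def tf_idf(doc):
    """Raw term frequency times log(N / df) for each unique word in one document."""
    tf = Counter(doc)
    return {w: tf[w] * math.log(N / df[w]) for w in set(doc)}

print(tf_idf(tokenized[0]))

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)
print(vectorizer.get_feature_names_out())
print(X.toarray().round(2))
```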
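
For Week 4, generating (target, context) pairs for skip-gram training. The sentence and the window-size hyperparameter are arbitrary; the resulting pairs (or the tokenized corpus itself) would then be used to fit an unsupervised model such as gensim's Word2Vec and to query word similarities.

```python
# Skip-gram training pairs: each word predicts its neighbours inside a window.
corpus = "the quick brown fox jumps over the lazy dog".split()
window = 2                     # hyperparameter: context words on each side

pairs = []
for i, target in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pairs.append((target, corpus[j]))

print(pairs[:6])
# [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown'), ...]
```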

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
