A language model is a probability distribution over sequences of words. Given such a sequence of length m, a language model assigns a probability P(w_1, ..., w_m) to the whole sequence. These probabilities are estimated by training the model on text corpora in one or more languages.
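For intuition, here is a toy sketch of that idea: a bigram model that estimates the probability of a sequence from raw counts. The corpus below is a made-up example (not part of this project's data), and no smoothing is applied:

```python
from collections import Counter

# Hypothetical toy corpus -- not part of the project data.
corpus = "the hero saved the city and the hero left".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def sequence_probability(words):
    """Approximate P(w1..wm) as P(w1) * product of P(wi | wi-1),
    using maximum-likelihood counts. No smoothing, so unseen words
    or bigrams get probability zero (or a division-by-zero error)."""
    prob = unigrams[words[0]] / len(corpus)
    for prev, curr in zip(words, words[1:]):
        prob *= bigrams[(prev, curr)] / unigrams[prev]
    return prob

print(sequence_probability("the hero saved".split()))  # ~0.111
```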
Here, I have provided the code in both .py and .ipynb formats, so you can follow along in whichever format you prefer.
BERT (Bidirectional Encoder Representations from Transformers) is a transformer-based machine learning technique for natural language processing pre-training, developed by Google.
The techniques we use (each step is sketched below):

Setup: importing packages, reading the data, preprocessing, partitioning.
Bag-of-Words: feature engineering & feature selection & machine learning with scikit-learn, testing & evaluation, explainability with lime.
Word Embedding: fitting a Word2Vec with gensim, feature engineering, a hybrid model with an LSTM layer and deep learning with tensorflow/keras, testing & evaluation.
Language Models: feature engineering with transformers, fine-tuning a pre-trained BERT with transformers and tensorflow/keras, testing & evaluation.
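A minimal sketch of the Setup step. The file name and the `history_text`/`creator` columns are hypothetical stand-ins; adjust them to the actual Kaggle dataset:

```python
import re

import pandas as pd
from sklearn.model_selection import train_test_split

# Read the data (hypothetical file/column names -- adapt to the real dataset).
df = pd.read_csv("superheroes.csv")

# Preprocessing: lowercase and keep only letters and spaces.
def clean_text(text):
    text = str(text).lower()
    return re.sub(r"[^a-z\s]+", " ", text).strip()

df["clean_text"] = df["history_text"].apply(clean_text)

# Partitioning: stratified train/test split on the creator label.
X_train, X_test, y_train, y_test = train_test_split(
    df["clean_text"], df["creator"],
    test_size=0.3, random_state=42, stratify=df["creator"],
)
```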
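A sketch of the Bag-of-Words step, continuing from the setup above. TF-IDF features feed a logistic regression (a stand-in for whatever classifier you choose), with `max_features` acting as a crude form of feature selection, and lime explaining a single prediction:

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Feature engineering + machine learning: TF-IDF features into a linear model.
bow_model = make_pipeline(
    TfidfVectorizer(max_features=10_000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
bow_model.fit(X_train, y_train)

# Testing & evaluation.
print(classification_report(y_test, bow_model.predict(X_test)))

# Explainability: which words pushed one test example toward its class?
explainer = LimeTextExplainer(class_names=list(bow_model.classes_))
explanation = explainer.explain_instance(
    X_test.iloc[0], bow_model.predict_proba, num_features=10
)
print(explanation.as_list())
```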
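A sketch of the Word Embedding step: fit Word2Vec with gensim, load the vectors into a frozen Keras Embedding layer, and classify with an LSTM. Vector size, sequence length, and layer sizes are arbitrary choices for illustration:

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.preprocessing import LabelEncoder
from tensorflow import keras

# Fit Word2Vec on the tokenized training texts.
tokens = [text.split() for text in X_train]
w2v = Word2Vec(sentences=tokens, vector_size=100, window=5, min_count=2)

# Feature engineering: map words to integer ids, build the embedding matrix.
tokenizer = keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(X_train)
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, 100))
for word, idx in tokenizer.word_index.items():
    if word in w2v.wv:
        embedding_matrix[idx] = w2v.wv[word]

maxlen = 200
seq_train = keras.preprocessing.sequence.pad_sequences(
    tokenizer.texts_to_sequences(X_train), maxlen=maxlen
)
labels = LabelEncoder().fit_transform(y_train)

# Hybrid model: frozen pre-trained embeddings feeding an LSTM layer.
lstm_model = keras.Sequential([
    keras.layers.Embedding(
        vocab_size, 100,
        embeddings_initializer=keras.initializers.Constant(embedding_matrix),
        trainable=False,
    ),
    keras.layers.LSTM(64),
    keras.layers.Dense(len(set(labels)), activation="softmax"),
])
lstm_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
lstm_model.fit(seq_train, labels, epochs=5, validation_split=0.1)
```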
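A sketch of the Language Models step: tokenize with a pre-trained BERT tokenizer from transformers and fine-tune the full model with its classification head via tensorflow/keras. The checkpoint name and hyperparameters are just illustrative defaults:

```python
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Feature engineering: turn raw text into BERT input ids and attention masks.
bert_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encodings = bert_tokenizer(
    list(X_train), truncation=True, padding=True,
    max_length=128, return_tensors="tf",
)
labels = LabelEncoder().fit_transform(y_train)

# Fine-tuning: pre-trained BERT body plus a fresh classification head.
bert = TFAutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(set(labels))
)
bert.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
bert.fit(dict(encodings), labels, epochs=3, batch_size=16)
```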
I love superheroes! In particular, I am a big Marvel fan, so I wanted to do something interesting with language models and picked up the superhero dataset from Kaggle. You can do many things with this dataset, but here I am using it for text classification: based on the history descriptions of the superheroes, I want to see if I can find who their creators are!