Thanks for the amazing feedback on my last (and first) Kaggle notebook.

Link: https://www.kaggle.com/snooptosh/intro-oop-in-python 

# What is PyCaret?

PyCaret is an open-source, low-code machine learning library in Python that lets you go from preparing your data to deploying your model within minutes in the notebook environment of your choice.

URL: https://pycaret.org/

## *In this notebook, I explore the NLP module of the PyCaret library by running a topic model on a sample of Hillary Clinton's tweets*

## Loading and installing relevant packages

In [None]:
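# Note: pycaret.nlp ships with PyCaret 2.x; if the import below fails, try: pip install "pycaret<3"
!pip install pycaret
# language resources used by the NLP module
!python -m spacy download en_core_web_sm
!python -m textblob.download_corpora
import pandas as pd
from pycaret.nlp import *  # provides setup, create_model, assign_model, plot_model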

## Read data

In [None]:
df_tweets = pd.read_csv("/kaggle/input/clinton-trump-tweets/tweets.csv")
df_tweets.head()

## Filter data for Hillary Clinton tweets

In [None]:
# filtering data for Hillary Clinton tweets
df_tweets_hc = df_tweets[df_tweets['handle'] == "HillaryClinton"].reset_index(drop=True)
print(df_tweets_hc.shape)
df_tweets_hc.head()

## Take a sample of 1,000 rows

In [None]:
df = df_tweets_hc.sample(1000, random_state=493).reset_index(drop=True)
print(df.shape)
df.head()

## Setting Up the Environment

In [None]:
# initialize the setup
nlp = setup(data = df, target = 'text', session_id = 493, custom_stopwords = [ 'rt', 'https', 'http', 'co', 'amp'])
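
To see why those custom stopwords matter for tweets, here is a minimal sketch in plain Python. It is not PyCaret's actual preprocessing pipeline; the `tokenize` helper and the sample tweet are hypothetical and only show how retweet markers, URL pieces, and HTML-entity leftovers would otherwise end up as tokens.

```python
import re

# Same custom stopwords passed to setup() above
CUSTOM_STOPWORDS = {"rt", "https", "http", "co", "amp"}

def tokenize(text, stopwords=CUSTOM_STOPWORDS):
    """Lowercase, keep runs of letters, and drop stopwords (illustrative only)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in stopwords]

# Hypothetical tweet, made up for illustration
sample = "RT @someone: Jobs &amp; wages matter https://t.co/abc"
print(tokenize(sample))
# ['someone', 'jobs', 'wages', 'matter', 't', 'abc']
```

Short URL fragments such as `t` and `abc` still slip through in this toy version, which is a reminder that the stopword list may need tuning for your own data.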


## Creating the Model

In the cell below, notice that we pass 'lda' as the model parameter. LDA stands for Latent Dirichlet Allocation. We could have just as easily opted for another type of model.


Here’s the list of models that PyCaret currently supports:
* ‘lda’: Latent Dirichlet Allocation
* ‘lsi’: Latent Semantic Indexing
* ‘hdp’: Hierarchical Dirichlet Process
* ‘rp’: Random Projections
* ‘nmf’: Non-Negative Matrix Factorization

In [None]:
# create the model
lda = create_model('lda', num_topics = 6, multi_core = True)

## Assigning the Model

In [None]:
# label the data using trained model
df_lda = assign_model(lda)
df_lda.head()
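
`assign_model` returns the data with topic weights attached to each row. As a rough illustration of the idea (a sketch with made-up weights, not PyCaret's internals), the dominant topic of a document is simply the topic with the largest weight. The `dominant_topic` helper and the numbers below are hypothetical.

```python
def dominant_topic(weights):
    """Return (topic_label, share) for the largest weight in a {label: weight} dict."""
    label = max(weights, key=weights.get)
    share = weights[label] / sum(weights.values())
    return label, round(share, 2)

# Made-up topic weights for one document (6 topics, as in create_model above)
doc_weights = {
    "Topic 0": 0.05, "Topic 1": 0.10, "Topic 2": 0.05,
    "Topic 3": 0.10, "Topic 4": 0.10, "Topic 5": 0.60,
}
print(dominant_topic(doc_weights))
# ('Topic 5', 0.6)
```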

## Plotting the Model

PyCaret offers a variety of plots. The type of graph generated depends on the `plot` parameter. Here is the list of currently available visualizations:

* ‘frequency’: Word Token Frequency (default)
* ‘distribution’: Word Distribution Plot
* ‘bigram’: Bigram Frequency Plot
* ‘trigram’: Trigram Frequency Plot
* ‘sentiment’: Sentiment Polarity Plot
* ‘pos’: Part of Speech Frequency
* ‘tsne’: t-SNE (3d) Dimension Plot
* ‘topic_model’: Topic Model (pyLDAvis)
* ‘topic_distribution’: Topic Infer Distribution
* ‘wordcloud’: Word cloud
* ‘umap’: UMAP Dimensionality Plot

In [None]:
plot_model(lda, plot='topic_distribution')
plot_model(lda, plot='topic_model')
plot_model(lda, plot='wordcloud', topic_num = 'Topic 5')
plot_model(lda, plot='frequency', topic_num = 'Topic 5')
plot_model(lda, plot='bigram', topic_num = 'Topic 5')
plot_model(lda, plot='trigram', topic_num = 'Topic 5')
plot_model(lda, plot='distribution', topic_num = 'Topic 5')
plot_model(lda, plot='sentiment', topic_num = 'Topic 5')
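
The 'bigram' and 'trigram' plots chart how often adjacent word pairs and triples occur. Here is a minimal sketch of that counting step in plain Python rather than PyCaret's own code; `ngram_counts` and the token list are hypothetical.

```python
from collections import Counter

def ngram_counts(tokens, n=2):
    """Count contiguous n-grams in a list of tokens."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

# Made-up tokens, for illustration only
tokens = ["good", "jobs", "higher", "wages", "good", "jobs"]

print(ngram_counts(tokens, n=2).most_common(1))
# [('good jobs', 2)]
```

Passing `n=3` gives the trigram counts in the same way.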

*Using the PyCaret library, we can easily experiment with topic models in a few lines of code. We covered the functions involved in each step and examined the parameters of those functions.*

References: 
* https://pycaret.org/
* https://towardsdatascience.com/topic-modeling-on-pycaret-2ce0c65ba3ff

**Do remember to upvote if the notebook was helpful. It will motivate me to publish more notebooks!! :-))**
    
Cheers!!