# **Identifying Trends in Social Media Data using Machine Learning in Python**

The dataset contains 10 social media posts, one per row, with two columns: id and text. The text column holds the body of each post.

In [None]:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Load the data and create the feature matrix
df = pd.read_csv('social_media_posts.csv')
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['text'])

# Fit a Latent Dirichlet Allocation model to the data
lda = LatentDirichletAllocation(n_components=5)
lda.fit(X)

# Print the top 10 words for each topic
top_words = 10
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
feature_names = vectorizer.get_feature_names_out()
for topic_idx, topic in enumerate(lda.components_):
    print(f"Topic {topic_idx+1}:")
    print(" ".join([feature_names[i] for i in topic.argsort()[:-top_words - 1:-1]]))
    print()


This code loads the posts from social_media_posts.csv and uses CountVectorizer to turn the text column into a document-term matrix of word counts. It then fits a Latent Dirichlet Allocation (LDA) model, an unsupervised algorithm commonly used for topic modeling, with 5 topics to that matrix. Finally, it prints the 10 highest-weighted words for each topic. Because LDA starts from a random initialization, the topics can differ between runs unless random_state is set.
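
The top words describe each topic, but they do not say which posts belong to it. As a minimal sketch of one way to get that, the fitted model can also return a topic distribution for every post, and the largest entry in each row gives the post's dominant topic. The posts list below is a hypothetical stand-in for df['text'], and n_components=2 and random_state=0 are assumptions chosen only to keep the example small and repeatable.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical stand-in posts; in the notebook this would be df['text'].
posts = [
    "new phone camera is amazing",
    "battery life on this phone is great",
    "our team won the football match",
    "great goal in the football final",
    "phone camera beats my old camera",
    "the match went to extra time",
]

# Build the document-term matrix of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(posts)

# Assumed settings: 2 topics, fixed seed so reruns give the same result
lda = LatentDirichletAllocation(n_components=2, random_state=0)

# fit_transform returns one row per post and one column per topic;
# each row is a normalized topic distribution
doc_topic = lda.fit_transform(X)

# Index of the highest-weighted topic for each post
dominant_topic = doc_topic.argmax(axis=1)

for text, topic_idx, weights in zip(posts, dominant_topic, doc_topic):
    print(f"Topic {topic_idx + 1} ({weights[topic_idx]:.2f}): {text}")
```

With only a handful of short posts the assignments are rough, so the weights are best read as a relative indication rather than a firm label.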