# Data journalism: data visualisation – implementation of interactive graphs (web enabled), infographics.

This notebook explores how sentiment and metadata from social media posts can be used to predict user engagement (likes + retweets). We also correlate trending news topics to online activity. This will help jouranlists find tredning topics via Social media and see how they effect each other


### Libraries Needed: 



import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from textblob import TextBlob
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

# add here when needed.

## Data sources: 
For social media, we use the X api [1]. This allows you gather posts from X within defined parameters. This will be done using hashtags these usally represent trending topics [2]. The newsAPI [3] will be used to gather news articles based on the paramters from the X posts, for example #Fitness retreived the X posts will be the search paramter for the news posts. 

## *Please add other data sources here if used*


1. https://docs.x.com/x-api/introduction
2. https://www.shopify.com/nz/blog/twitter-hashtags
3. https://newsapi.org/

# Limitations: 

# Ethical data usage: 


### X: 
The X API can be used for a university project if it aligns with X’s License Agreement, prioritizing user privacy, transparency, and ethical data use while avoiding harmful applications like misinformation or unauthorized data scraping. Ensure compliance with platform policies and secure data handling, especially for public interest research, though access may require navigating paid tiers or specific approvals under regulations like the EU’s DSA. (https://developer.x.com/en/developer-terms/agreement-and-policy) 

### NewsAPI: 
The News API (https://newsapi.org/terms) can be ethically used for a university project by adhering to its terms, which require lawful data use, compliance with local regulations, and respecting intellectual property through proper source attribution. Ensure transparency, secure handling of the API key, and limit data use to non-commercial academic purposes within the free tier’s 500 requests/day, avoiding unauthorized redistribution of licensed content.


### Here is how the NewsAPI is used. This wont run on this notebook. 


In [3]:

# --- Initialize News API ---
API_KEY = "7af7d5e56edc4148aac908f2c9f86ac3"  
newsapi = NewsApiClient(api_key=API_KEY)

st.title("📊 Real-Time Social + News Dashboard with Engagement Forecasting")

# --- User Topic Input ---
topic = st.text_input("Enter a topic keyword (e.g., #Fitness, climate change):", "#Fitness")

# --- News Fetching ---
if topic:
    with st.spinner("Fetching news articles..."):
        all_articles = newsapi.get_everything(
            q=topic,
            language='en',
            sort_by='publishedAt',
            page_size=10
        )
    articles = all_articles.get('articles', [])

    st.header(f"📰 Latest News on {topic}")
    if articles:
        for article in articles:
            st.subheader(article['title'])
            st.write(article['description'])
            st.markdown(f"[Read more]({article['url']})")
            st.write(f"Published at: {article['publishedAt']}")
            st.markdown("---")
    else:
        st.write("No news articles found for this topic.")

# --- Load Dataset ---
@st.cache_data
def load_social_data():
    df = pd.read_csv("data/x_posts_with_weather.csv")
    df['created_at'] = pd.to_datetime(df['created_at'], errors='coerce')
    return df

df = load_social_data()

# --- Filter Dataset ---
if topic:
    mask = df['hashtags'].str.contains(topic.replace("#", ""), case=False, na=False)
    filtered_df = df[mask]

    st.header(f"📱 Social Media Posts on {topic}")
    st.write(f"Total posts found: {filtered_df.shape[0]}")

    if not filtered_df.empty:
        st.line_chart(filtered_df.groupby(filtered_df['created_at'].dt.floor('H')).size())
    else:
        st.write("No social media posts found for this topic.")

NameError: name 'NewsApiClient' is not defined