<a href="https://colab.research.google.com/github/joyjixu/qm2_resources/blob/main/data_analysis/month_violin.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import pandas as pd

from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.figure_factory as ff
import numpy as np


In [None]:
sentiment_covid = pd.read_csv('covid_sentiment_daily_04_01.csv', engine='python')
sentiment_covid = sentiment_covid[['state', 'sentiment', 'magnitude']]
average_state_sentiment = pd.read_csv("average_sentiment_01.csv",  engine='python')

average_state_sentiment.info()
sentiment_covid.info()

In [None]:
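# Collect the remaining days of April (02 to 30) alongside the frames loaded above.
average_frames = [average_state_sentiment]
tweet_frames = [sentiment_covid]

for day in range(2, 31):
    if day < 10:
        day = "0{}".format(day)  # file names use zero-padded days

    path_average = 'average_sentiment_{}.csv'.format(day)
    path_sentiment_tweets = 'covid_sentiment_daily_04_{}.csv'.format(day)

    daily_average_state_sentiment = pd.read_csv(path_average, engine='python')
    daily_sentiment_covid = pd.read_csv(path_sentiment_tweets, engine='python')
    daily_sentiment_covid = daily_sentiment_covid[['state', 'sentiment', 'magnitude']]

    average_frames.append(daily_average_state_sentiment)
    tweet_frames.append(daily_sentiment_covid)

# DataFrame.append was removed in pandas 2.0, so concatenate once after the loop.
average_state_sentiment = pd.concat(average_frames, ignore_index=True)
sentiment_covid = pd.concat(tweet_frames, ignore_index=True)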
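
The loop pads single-digit days by hand before building each file name. As a minimal sketch of an alternative, the `{:02d}` format specifier does the padding in one step. `daily_paths` is a hypothetical helper that is not part of this notebook, and the `month` parameter is an assumption added for illustration:

```python
def daily_paths(first_day, last_day, month=4):
    """Return (average_path, tweets_path) pairs for each day, inclusive.

    Hypothetical helper: it only builds file names following the pattern
    used in this notebook and does not check that the files exist.
    """
    pairs = []
    for day in range(first_day, last_day + 1):
        pairs.append((
            'average_sentiment_{:02d}.csv'.format(day),
            'covid_sentiment_daily_{:02d}_{:02d}.csv'.format(month, day),
        ))
    return pairs


for average_path, tweets_path in daily_paths(2, 4):
    print(average_path, tweets_path)
```

Each pair could then be passed to `pd.read_csv` exactly as in the loop.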

In [None]:
sentiment_covid.info()
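
`info()` reports only column types and non-null counts. As a rough sketch of the summary statistics that the violin plots later draw (the mean line and the box quartiles), the standard library `statistics` module can compute them directly. The values below are made up for illustration and are not taken from the tweet data, and Plotly's quartile method may differ slightly from the `inclusive` method assumed here:

```python
import statistics

# Illustrative sentiment scores only; not real data from the CSV files.
sample_sentiment = [-0.5, 0.0, 0.25, 0.5, 1.0]

mean_value = statistics.mean(sample_sentiment)
median_value = statistics.median(sample_sentiment)
# n=4 splits the data into quartiles; the middle cut point is the median.
q1, q2, q3 = statistics.quantiles(sample_sentiment, n=4, method='inclusive')

print('mean:', mean_value)
print('median:', median_value)
print('quartiles:', q1, q2, q3)
```

With the real data, the same call could be applied to `sentiment_covid['sentiment'].tolist()`.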

The code here is taken from the Plotly library documentation and adapted to suit our needs, with help from Stack Overflow:
* https://plotly.com/python/choropleth-maps/
* https://plotly.com/python/mapbox-county-choropleth/
* https://plotly.com/python/reference/choropleth/
* https://plotly.com/python/subplots/#simple-subplot
* https://plotly.com/python/violin/#violin-plot-with-goviolin
* https://stackoverflow.com/questions/48687956/plotly-python-label-value-formatting


We create a Plotly figure with two side-by-side violin plots showing the distribution of sentiment and of magnitude across all tweets for the month.

In [None]:

fig = make_subplots(
        rows=1, cols=2,
        specs=[[{'type': 'xy'}, {'type': 'xy'}]],  # violin traces are drawn on ordinary xy axes
        column_widths=[0.5, 0.5],
        subplot_titles=('Violin Plot of Sentiment Distribution', 'Violin Plot of Magnitude Distribution'))



fig.add_trace(
    go.Violin(y=sentiment_covid['sentiment'].to_numpy(),
                box_visible=True,
                line_color='black',
                meanline_visible=True, 
                fillcolor='lightseagreen', 
                opacity=0.6,
                y0='Sentiment',
                x0='Sentiment Spread', 
                showlegend=False
                ),
                row=1, col=1
)


fig.add_trace(
    go.Violin(y=sentiment_covid['magnitude'].to_numpy(),
                box_visible=True,
                line_color='black',
                meanline_visible=True, 
                fillcolor='lightseagreen', 
                opacity=0.6,
                y0='Magnitude',
                x0='Magnitude Spread', 
                showlegend=False),
                row=1, col=2
                
)


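# The figure holds only violin traces, so this histogram selector matches no trace
# and the call leaves the default hover labels unchanged.
fig.update_traces(hovertemplate='<extra></extra>', selector=dict(type='histogram'))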

fig.update_layout(
    title_text='Distribution of COVID-19 Tweets in the US, April 2020',
    height=700, width=900, hovermode='x unified',
)

# Axis titles and tick labels rounded to 3 significant figures
fig.update_yaxes(title_text='Sentiment', tickformat='0.3r', row=1, col=1)
fig.update_yaxes(title_text='Magnitude', tickformat='0.3r', row=1, col=2)


fig.show()
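
The `tickformat="0.3r"` setting relies on the d3-format `r` type, which writes numbers in decimal notation rounded to a given number of significant digits. To make that rounding concrete, here is a minimal sketch in plain Python. `round_sig` is a hypothetical helper written for illustration; it mimics the idea of rounding to significant figures and is not how Plotly formats ticks internally:

```python
import math


def round_sig(value, digits=3):
    """Round value to `digits` significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 3 for 1234.5 and -2 for 0.045678.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


for v in (0.123456, 1234.5, -0.045678):
    print(v, '->', round_sig(v))
```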




Saving the plot as HTML to embed on our website:


In [None]:
fig.write_html("violin_month_updated.html")

In [None]:
#! pip install -U kaleido


In [None]:
fig.write_image("./violin_month_static_updated.png", engine="kaleido")