### Disclaimer

If you are viewing this file on GitHub, the interactive plots will not be shown: they are implemented as embedded HTML + JS, which GitHub does not render. To view the file properly, open it with nbviewer at:

[https://nbviewer.org/github/gbulg/Data-analysis-class/blob/main/Final%20Project/handout.ipynb](https://nbviewer.org/github/gbulg/Data-analysis-class/blob/main/Final%20Project/handout.ipynb)

---
# Interactive Plots Handout


In addition to ordinary static plots, there are ways to make plots interactive.

There are several Python libraries for making interactive plots. Here I will show two of them, Plotly and Bokeh. Others include Altair.

Interactive plots have several uses, for example exploring patterns and relationships in complex data that are hard to see in a static figure.

Plots can be interactive in different ways: hovering, clicking, zooming, panning, and selecting data points.

## 0. Load Data

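In this handout I will use data from the DraCor Shakespeare corpus ([https://dracor.org/shake](https://dracor.org/shake)). It contains Shakespeare's plays together with per-play metadata such as the year written, word counts, and character-network measures.
Let's load the metadata via the API.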

In [17]:
import pandas as pd
import requests

In [15]:
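corpus_url = "https://dracor.org/api/corpora/shake"

# Request the corpus metadata as CSV; stream=True leaves the body unread
# so that pandas can read it directly from the raw response.
metadata_info = requests.get(corpus_url + "/metadata", headers={"accept": "text/csv"}, stream=True)
# Decode any gzip/deflate content encoding when the raw stream is read.
metadata_info.raw.decode_content = True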

In [16]:
metadata_df = pd.read_csv(metadata_info.raw, sep=",", encoding="utf-8")

In [18]:
metadata_df

Unnamed: 0,name,id,firstAuthor,numOfCoAuthors,title,subtitle,normalizedGenre,digitalSource,originalSourcePublisher,originalSourcePubPlace,...,numEdges,yearWritten,numOfSegments,wikipediaLinkCount,numOfActs,wordCountText,wordCountSp,wordCountStage,numOfP,numOfL
0,a-midsummer-night-s-dream,shake000008,Shakespeare,0,A Midsummer Night’s Dream,,,,,,...,205,1595,9,65,5,17772,17127,789,179,1749
1,all-s-well-that-ends-well,shake000012,Shakespeare,0,All’s Well That Ends Well,,,,,,...,127,1605,24,32,5,25066,24421,826,524,1627
2,antony-and-cleopatra,shake000035,Shakespeare,0,Antony and Cleopatra,,,,,,...,398,1606,42,39,5,27119,25878,1407,149,3236
3,as-you-like-it,shake000010,Shakespeare,0,As You Like It,,,,,,...,110,1599,23,48,5,23721,23113,970,597,1205
4,coriolanus,shake000026,Shakespeare,0,Coriolanus,,,,,,...,361,1608,29,36,5,29948,28851,1460,302,2984
5,cymbeline,shake000036,Shakespeare,0,Cymbeline,,,,,,...,195,1610,27,28,5,30141,29093,1802,149,3237
6,hamlet,shake000032,Shakespeare,0,Hamlet,,,,,,...,220,1602,20,94,5,32539,31665,1133,488,2922
7,henry-iv-part-i,shake000017,Shakespeare,0,"Henry IV, Part I",,,,,,...,147,1597,19,28,5,26290,25466,1057,455,1730
8,henry-iv-part-ii,shake000018,Shakespeare,0,"Henry IV, Part II",,,,,,...,232,1598,19,21,5,28461,27650,1015,620,1589
9,henry-v,shake000019,Shakespeare,0,Henry V,,,,,,...,206,1599,29,35,5,27982,27202,805,504,1825


In [19]:
metadata_df.columns

Index(['name', 'id', 'firstAuthor', 'numOfCoAuthors', 'title', 'subtitle',
       'normalizedGenre', 'digitalSource', 'originalSourcePublisher',
       'originalSourcePubPlace', 'originalSourceYear',
       'originalSourceNumberOfPages', 'yearNormalized', 'size', 'libretto',
       'averageClustering', 'density', 'averagePathLength', 'maxDegreeIds',
       'averageDegree', 'diameter', 'yearPremiered', 'yearPrinted',
       'maxDegree', 'numOfSpeakers', 'numOfSpeakersFemale',
       'numOfSpeakersMale', 'numOfSpeakersUnknown', 'numPersonGroups',
       'numConnectedComponents', 'numEdges', 'yearWritten', 'numOfSegments',
       'wikipediaLinkCount', 'numOfActs', 'wordCountText', 'wordCountSp',
       'wordCountStage', 'numOfP', 'numOfL'],
      dtype='object')
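Before plotting, it can help to keep only the columns a plot needs and to derive the quantities of interest. The following is a minimal sketch, not part of the original notebook: `metadata_sample` is a hypothetical stand-in for `metadata_df`, built from three rows of the table above, and `stageShare` is a column name I made up. It assumes that `wordCountStage` counts the words in stage directions.

```python
import pandas as pd

# Hypothetical stand-in for metadata_df: three rows copied from the table above.
metadata_sample = pd.DataFrame(
    {
        "name": ["a-midsummer-night-s-dream", "hamlet", "henry-v"],
        "yearWritten": [1595, 1602, 1599],
        "wordCountText": [17772, 32539, 27982],
        "wordCountStage": [789, 1133, 805],
    }
)

# Keep only the columns needed for plotting.
plot_columns = ["name", "yearWritten", "wordCountText", "wordCountStage"]
plot_df = metadata_sample[plot_columns].copy()

# Derived column (the name is an assumption): share of words in stage directions.
plot_df["stageShare"] = plot_df["wordCountStage"] / plot_df["wordCountText"]

print(plot_df.sort_values("yearWritten"))
```

The same two steps should work on the full `metadata_df`, since it has the same column names.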

## 1. Plotly

In [4]:
import plotly.express as px

# Use the iris dataset bundled with Plotly Express
df = px.data.iris()

# Plot sepal width against the row index as a line chart
fig = px.line(df, y="sepal_width")

# Show the plot
fig.show()


I. Introduction

A. Explanation of what interactive plots are and their importance in data analysis

B. Overview of Python libraries used for creating interactive plots (e.g. Plotly, Bokeh)

II. Plotly

A. Explanation of Plotly library and its features

B. Installation and setup of Plotly in Python

C. Basic example of creating a line chart using Plotly

D. Advanced examples of creating interactive plots such as scatter plots, bar plots, heat maps, etc.

E. Customization options for Plotly plots (e.g. changing colors, markers, etc.)

III. Bokeh

A. Explanation of Bokeh library and its features

B. Installation and setup of Bokeh in Python

C. Basic example of creating a line chart using Bokeh

D. Advanced examples of creating interactive plots such as scatter plots, bar plots, heat maps, etc.

E. Customization options for Bokeh plots (e.g. changing colors, markers, etc.)

IV. Comparison between Plotly and Bokeh

A. Pros and cons of each library

B. Use cases for each library

C. Comparison of the functionality and customization options of each library

V. Best practices for creating interactive plots

A. Choosing the right library for a specific use case

B. Importance of data preparation for interactive plots

C. Design considerations for creating effective and user-friendly interactive plots

VI. Conclusion

A. Recap of the importance of interactive plots in data analysis

B. Summary of the key points covered in the handout

C. Recommendations for further study and resources.


### Interactive Plots in Python

Interactive plots allow users to explore and manipulate data in a more engaging and informative way than static plots. In Python, there are several libraries that provide functionality for creating interactive plots, such as Plotly, Bokeh, and Altair. These libraries offer a wide range of interactive features, such as zooming, panning, hovering, and selecting data points.
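As a rough illustration of these features, here is a minimal Bokeh sketch (my own example, not code from the notebook). It enables panning, wheel zoom, box selection, and hover tooltips on a scatter plot. The five data rows are copied by hand from the DraCor metadata table in section 0, and the title is an arbitrary choice.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Five rows copied by hand from the metadata table in section 0.
source = ColumnDataSource(
    data={
        "title": ["Antony and Cleopatra", "Coriolanus", "Cymbeline", "Hamlet", "Henry V"],
        "numOfP": [149, 302, 149, 488, 504],
        "numOfL": [3236, 2984, 3237, 2922, 1825],
    }
)

# The tools string turns on pan, zoom, selection, and reset;
# passing tooltips adds a hover tool that reads fields from the source.
p = figure(
    title="numOfP vs. numOfL for selected plays",
    x_axis_label="numOfP",
    y_axis_label="numOfL",
    tools="pan,wheel_zoom,box_select,reset",
    tooltips=[("Play", "@title"), ("numOfP", "@numOfP"), ("numOfL", "@numOfL")],
)
p.scatter("numOfP", "numOfL", source=source, size=10)
```

In a notebook, calling `output_notebook()` and then `show(p)` (both available from `bokeh.io`) displays the figure inline.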

### Usage in Linguistic Research

Interactive plots can be particularly useful in linguistic research, where complex linguistic data often requires visual exploration to understand patterns and relationships. Here are some examples of how interactive plots can be used in linguistic research:

Visualizing word frequency distributions: Interactive word clouds let users zoom in and out to see the most frequent words in a corpus. This helps researchers quickly identify the most frequent words in a large body of text.

Plotting linguistic change over time: Interactive line charts can show trends in language usage, for example how the frequency of particular words or phrases changes over time.

Analyzing lexical similarity: Interactive plots can be used to visualize the similarity between words in a lexicon. Researchers can create interactive heatmaps to see the similarity between words, or use interactive scatter plots to see the relationships between words.

Exploring linguistic diversity: Interactive plots can be used to explore linguistic diversity in a corpus. For example, researchers can use interactive bar charts to plot the frequency of different languages in a corpus, or use interactive scatter plots to visualize the diversity of language usage in different regions.

### Conclusion

In conclusion, interactive plots are a powerful tool for exploring and visualizing linguistic data. By allowing users to interact with data, they provide a more engaging and informative way to understand linguistic patterns and relationships. Whether you're a researcher, student, or simply someone interested in language, interactive plots can help you get the most out of your linguistic data.


(6p) a handout on one of the topics we haven't covered in class, e.g., BERT, clustering, machine learning, regression, interactive plots

(2p) a corresponding homework with a solution