# Class 26: LLMs and widgets

Plan for today:
- LLMs and Hugging Face
- Jupyter widgets
- Bonus material: Creating a website on GitHub


In [None]:
import YData

# YData.download.download_class_code(26)   # get class code    
# YData.download.download_class_code(26, True) # get the code with the answers 



If you are using Google Colab, you should run the code below.

In [None]:
# !pip install https://github.com/emeyers/YData_package/tarball/master
# from google.colab import drive
# drive.mount('/content/drive')

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
from urllib.request import urlopen

import matplotlib.pyplot as plt
%matplotlib inline



## 1. Large language models (LLMs)

Large language models (LLMs) are taking over the world. I, for one, welcome our new robot [overlords](https://www.youtube.com/watch?v=8lcUHQYhPTE).

Let's explore how we can use a model from Hugging Face to create a chatbot.

To do this we need to install some additional packages. I recommend cloning your Jupyter environment, and then adding these packages to the new environment.


In [None]:
## If you are using uv, you might need to uncomment the lines below to install the following packages
# !uv add torch
# !uv add transformers

In [None]:
# Modified from code created by Giuliano Formisano
# Updated to work with newer versions of the Hugging Face packages

from transformers import pipeline, BlenderbotTokenizer, BlenderbotForConditionalGeneration

model_name = "facebook/blenderbot-400M-distill"

# Option A: Using text2text-generation pipeline (single-turn):
chatbot = pipeline("text2text-generation", model=model_name, tokenizer=model_name, device="cpu")

# Without the device argument, this uses "mps" by default on my Mac, which should be faster, but there appears to be a bug in the code
#chatbot = pipeline("text2text-generation", model=model_name, tokenizer=model_name)


user_input = "Hi! What can you do?"
response = chatbot(user_input, max_new_tokens=50)

print(f"User:  {user_input}")
print(f"Bot:   {response[0]['generated_text']}")



### Interactive user-chatbot loop

In [None]:
# Interactive loop between the user and the chatbot
while True:
    user_input = input("You: ")  # type your message in the box that appears below
    if user_input.lower() == "quit":  # type "quit" to end the loop
        break
    response = chatbot(user_input)  # this is a bit slow
    print(f"Chatbot: {response[0]['generated_text']}")
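
The loop logic can be tried without downloading a model. The sketch below is only an illustration, not the class code. `fake_chatbot` and `run_chat` are hypothetical names, the stand-in simply echoes its input in the list-of-dictionaries shape that `response[0]['generated_text']` implies, and a list of messages replaces `input()`.

```python
def fake_chatbot(text):
    """Hypothetical stand-in that mimics the shape of the pipeline output."""
    return [{"generated_text": f"You said: {text}"}]


def run_chat(respond, messages):
    """Send each message to respond() and stop at the first 'quit'."""
    transcript = []
    for user_input in messages:
        if user_input.strip().lower() == "quit":
            break
        reply = respond(user_input)[0]["generated_text"]
        transcript.append((user_input, reply))
    return transcript


# Messages after "quit" are never sent to the chatbot
transcript = run_chat(fake_chatbot, ["Hello", "How are you?", "quit", "ignored"])
for user_input, reply in transcript:
    print(f"You: {user_input}")
    print(f"Chatbot: {reply}")
```

Passing the real `chatbot` pipeline in place of `fake_chatbot` should work the same way, assuming it returns the same structure.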

### Hugging Face data sets

In [None]:
# Hugging Face also has a number of large/interesting data sets
# (some of them controversial, and of course, one should be cautious of the veracity of all data sets)

# Load a data set 

from datasets import load_dataset

#emails = load_dataset("tensonaut/EPSTEIN_FILES_20K", split="train")
emails = load_dataset("corbt/enron-emails", split="train")


# print type and shape
print(emails.shape)
print(type(emails))


# convert to a pandas data frame
emails_df = emails.to_pandas()

display(emails_df.head())


In [None]:
# search for keywords
emails_containing = "summers" 

column_name = "body"  
#column_name = "text"  


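# str.contains does a substring match; case=False ignores case and na=False treats missing text as a non-match
# (an index test such as rfind(...) > 0 would skip emails where the keyword is at position 0)
selected_emails = emails_df[emails_df[column_name].str.contains(emails_containing, case=False, regex=False, na=False)]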


print(selected_emails.shape)
display(selected_emails.head())

print(selected_emails.iloc[0][column_name])
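
The keyword filter can be checked on a tiny example. The sketch below uses a hypothetical `toy_emails` data frame (not the real data) to compare two ways of building the mask: an index-based test with `rfind(...) > 0`, and a substring test with `str.contains`. The index-based test misses a keyword that starts the text, because its index is 0.

```python
import pandas as pd

# Hypothetical toy data standing in for the emails data frame
toy_emails = pd.DataFrame({
    "body": [
        "Summers is joining the call",       # keyword at position 0
        "Forwarding the note from Summers",  # keyword later in the text
        "Lunch on Friday?",                  # no keyword
        None,                                # missing body
    ]
})

keyword = "summers"

# Index-based test: a match at position 0 fails the "> 0" check
index_mask = toy_emails["body"].str.lower().str.rfind(keyword) > 0

# Substring test: case-insensitive, and missing values count as non-matches
contains_mask = toy_emails["body"].str.contains(keyword, case=False, regex=False, na=False)

print("rows matched by the index test:    ", int(index_mask.sum()))
print("rows matched by the substring test:", int(contains_mask.sum()))
print(toy_emails[contains_mask])
```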

## 2. Widgets

We can add interactive "widgets" (i.e., buttons, sliders, checkboxes, etc.) to a Jupyter notebook that let us explore our data. 

Let's examine this now...


### Slider

In [None]:
import ipywidgets as widgets
from IPython.display import display


In [None]:
# Create a slider




In [None]:
# Get the value of the slider



In [None]:
# Modify data based on the slider value
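
One possible way to fill in the three slider steps above is sketched here. The slider range, the starting value, and the `scores` data are arbitrary choices made for illustration, not values from the class.

```python
import ipywidgets as widgets
import pandas as pd
from IPython.display import display

# Create a slider (range and starting value are arbitrary choices)
slider = widgets.IntSlider(value=5, min=0, max=10, step=1, description="Threshold:")
display(slider)

# Get the current value of the slider
print(slider.value)

# Use the slider value to filter some data (hypothetical toy values)
scores = pd.Series([2, 4, 6, 8, 10])
print(scores[scores > slider.value])
```

In a notebook, it helps to put each step in its own cell: move the slider, then re-run the later cells to pick up the new value.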



### Updating a figure

We can use a widget to update a figure.

This example is from: https://www.youtube.com/watch?v=wb6k_T4rKBQ&t=3s

In [None]:
#!pip install jupyter_contrib_nbextensions
#!jupyter nbextension enable --py widgetsnbextension

In [None]:
cars = sns.load_dataset("mpg")
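
A minimal sketch of a widget-driven figure is shown below. To keep it runnable offline, it uses a small hand-made `toy_cars` table as a stand-in for the cars data. The function name `plot_cars` and the slider range are assumptions made for illustration.

```python
import ipywidgets as widgets
import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import display

# Hypothetical toy table standing in for the cars data, so the sketch runs offline
toy_cars = pd.DataFrame({
    "mpg": [18.0, 15.0, 36.0, 27.0, 44.0],
    "horsepower": [130, 165, 58, 90, 48],
})


def plot_cars(min_mpg=10):
    """Plot horsepower against mpg for cars with at least min_mpg miles per gallon."""
    subset = toy_cars[toy_cars["mpg"] >= min_mpg]
    fig, ax = plt.subplots()
    ax.scatter(subset["horsepower"], subset["mpg"])
    ax.set_xlabel("horsepower")
    ax.set_ylabel("mpg")
    ax.set_title(f"{len(subset)} cars with mpg >= {min_mpg}")
    plt.show()


# A (min, max) tuple of ints asks ipywidgets for an integer slider over that range
controls = widgets.interactive(plot_cars, min_mpg=(10, 45))
display(controls)
```

`widgets.interact(plot_cars, min_mpg=(10, 45))` is a shortcut that builds and displays the same controls in one call. `interactive` is used here because it returns the widget container, which can then be inspected or laid out separately.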










In [None]:
# If we instead make the function argument a Boolean, a checkbox widget is added
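
The link between argument type and widget type can be sketched as follows. The function `describe` and its arguments are hypothetical, chosen only to show that ipywidgets picks a slider for an integer, a checkbox for a Boolean, and a dropdown for a list.

```python
import ipywidgets as widgets
from IPython.display import display


def describe(n_rows=5, show_header=True, column="mpg"):
    """Build a one-line description from the three control values."""
    header = "Preview: " if show_header else ""
    return f"{header}first {n_rows} rows of {column}"


# The type of each value decides the widget:
# int -> slider, bool -> checkbox, list -> dropdown
controls = widgets.interactive(
    describe,
    n_rows=5,
    show_header=True,
    column=["mpg", "horsepower", "weight"],
)
display(controls)
```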








In [None]:
# Another example













In [None]:
%%capture

# You can run this code to convert this Jupyter notebook into a pdf
!quarto render class_26.ipynb --cache-refresh --to pdf 

## 3. Displaying Jupyter notebooks as websites

If we save our Jupyter notebooks as .html documents, we can upload them to GitHub and view them as webpages on the Internet. This can be useful for showing your work to others (e.g., potential employers). 

We can create a website on GitHub by completing the following instructions:

1. Create an account on [GitHub.com](https://github.com)  
2. Create a new repo. Call the repo "YData_website" or a similar name. Make sure that you select the option to include a readme file   
3. Click on Settings (top right below the repo name). Click on "Pages" in the left menu (at the bottom of the "Code and automation" section)  
4. Select the Branch to be "main"  
5. Upload an HTML document to the main GitHub repository; e.g., upload a webpage called "simple_page.html"   
6. The website is available at: https://[username].github.io/[repo_name]/simple_page.html   (where "repo_name" is the repo name from step 2)  
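
Steps 5 and 6 can be sketched in Python as follows. This is only an illustration: the helper name `pages_url`, the page contents, and the username `my-username` are placeholders, and the upload itself still happens on GitHub.

```python
from pathlib import Path


def pages_url(username, repo_name, page="simple_page.html"):
    """Return the GitHub Pages address for a file in the root of the repo."""
    return f"https://{username}.github.io/{repo_name}/{page}"


# Write a minimal page that could be uploaded in step 5
html = """<!DOCTYPE html>
<html>
  <head><title>My YData page</title></head>
  <body><h1>Hello from GitHub Pages</h1></body>
</html>
"""
Path("simple_page.html").write_text(html, encoding="utf-8")

# "my-username" is a placeholder; substitute your own GitHub username
print(pages_url("my-username", "YData_website"))
```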

You can also create interactive notebooks by uploading Jupyter .ipynb files to the GitHub site and then using https://mybinder.org/ to render them. 


In [None]:
import plotly.express as px
import plotly
plotly.offline.init_notebook_mode()  # allows interactive graphics to work when saved as an HTML document


# Create some interactive graphics to make this into an interesting webpage
gapminder = px.data.gapminder()   # the plotly package comes with the gapminder data
gapminder_2007 = gapminder[gapminder['year'] == 2007]
gapminder_2007_alt = gapminder.query("year==2007")    # equivalent filter using query()
print(gapminder_2007.equals(gapminder_2007_alt))      # check that the two filters give the same result

# Create an animated scatter plot
fig = px.scatter(gapminder, 
                 x="gdpPercap", 
                 y="lifeExp", 
                 animation_frame="year", 
                 animation_group="country",
                 size="pop", 
                 color="continent", 
                 hover_name="country", 
                 facet_col="continent",
                 log_x=True, 
                 size_max=45, 
                 range_x=[100,100000], 
                 range_y=[25,90])
fig.show()

# Create a sunburst plot
fig = px.sunburst(gapminder_2007, 
                  path=['continent', 'country'], 
                  values='pop', 
                  color='lifeExp')   
fig.update_layout(width = 500, height = 500)
fig.show()

# Create a treemap
fig = px.treemap(gapminder_2007, 
                 path=[px.Constant('world'), 'continent', 'country'], 
                 values='pop', 
                 color='lifeExp')
                 #color='gdpPercap')

fig.show()

In [None]:
# Render the document as an html document so it can be shown on the web (e.g., on GitHub pages)
# Be sure to also update the Quarto header at the top of the notebook to include  html:  embed-resources: true  (as is done in this notebook)

!quarto render class_26.ipynb --cache-refresh --to html
