Introduction to Large Language Models

Author/Perpetrator: Carlo Graziani, including materials on LLMs by Varuni Sastri, and discussion/editorial work by Taylor Childers, Archit Vasan, Bethany Lusch, and Venkat Vishwanath (Argonne)

Word embedding visualizations adapted from Kevin Gimpel (Toyota Technological Institute at Chicago) Visualizing BERT.

This tutorial covers some fundamental concepts necessary to the study of large language models (LLMs). The goal is to set the table for Archit Vasan's exploration of LLM pipelines next week.

Environment Setup (thanks, Bethany)

  1. If you are using ALCF, first log in. From a terminal, run the following command:
ssh username@polaris.alcf.anl.gov
  2. Although we cloned the repo earlier, you'll want the updated version. To be reminded of the instructions for syncing your fork, click here.

  3. We will be downloading data in our Jupyter notebook, which runs on hardware that has no Internet access by default. From the terminal on Polaris, edit the ~/.bash_profile file to include these proxy settings (a quick connectivity check is sketched after this list):

export HTTP_PROXY="http://proxy-01.pub.alcf.anl.gov:3128"
export HTTPS_PROXY="http://proxy-01.pub.alcf.anl.gov:3128"
export http_proxy="http://proxy-01.pub.alcf.anl.gov:3128"
export https_proxy="http://proxy-01.pub.alcf.anl.gov:3128"
export ftp_proxy="http://proxy-01.pub.alcf.anl.gov:3128"
export no_proxy="admin,polaris-adminvm-01,localhost,*.cm.polaris.alcf.anl.gov,polaris-*,*.polaris.alcf.anl.gov,*.alcf.anl.gov"
  4. Now that we have the updated notebooks, we can open them. If you are using ALCF JupyterHub or Google Colab, you can be reminded of the steps here.

  5. Reminder: Change the notebook's kernel to datascience/conda-2023-01-10 (you may need to change the kernel each time you open a notebook for the first time):

    1. select Kernel in the menu bar
    2. select Change kernel...
    3. select datascience/conda-2023-01-10 from the drop-down menu
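
As a quick sanity check of the proxy settings from step 3, the sketch below can be run in a notebook cell once the kernel is set. It uses only the Python standard library; huggingface.co is an arbitrary test endpoint (an assumption), so substitute whatever host your notebook actually downloads from.

```python
import os
import urllib.request

# Confirm the proxy variables from ~/.bash_profile are visible to this kernel.
for var in ("http_proxy", "https_proxy"):
    print(f"{var} = {os.environ.get(var, '<not set>')}")

# Attempt a small HTTPS request; urllib honors the *_proxy environment
# variables automatically. huggingface.co is just a test endpoint here.
try:
    with urllib.request.urlopen("https://huggingface.co", timeout=10) as resp:
        print("Internet access looks OK, HTTP status:", resp.status)
except Exception as err:
    print("Download test failed:", err)
```

If the variables print as `<not set>`, the kernel was likely started before ~/.bash_profile was updated; restarting your JupyterHub session should pick them up.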

References:

I strongly recommend reading "The Illustrated Transformer" by Jay Alammar before next week's deeper dive into Transformer tech by Archit Vasan. Alammar also has a useful post dedicated more generally to sequence-to-sequence modeling, "Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)," which illustrates the attention mechanism in the context of a more generic language translation model.
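
As a small preview of the attention mechanism those posts illustrate, here is a minimal NumPy sketch of scaled dot-product attention. The matrices and the embedding size are made-up toy values purely for illustration; next week's session covers the real Transformer machinery.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ V, weights                        # weighted sum of values

# Three "tokens" with an arbitrary embedding size of 4 (toy example).
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))

output, attn = scaled_dot_product_attention(Q, K, V)
print("attention weights:\n", attn.round(3))           # each row sums to 1
print("output shape:", output.shape)
```

Each row of the weight matrix shows how strongly one position attends to every other position; a Transformer stacks many such attention computations with learned query, key, and value projections.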