# Wrangling Humanities Data with AI and R – A Very Brief Introduction

In this workshop we will explore how generative AI (ChatGPT, Copilot, etc.) can help you create R code for working with humanistic data.
This session focuses on a set of letters written between 1915 and 1919 by H.J.C. (Jack) Peirs, a British World War I officer.
The letters are stored as **structured data**, meaning the various elements are arranged in predefined rows and columns in a spreadsheet.

### What we will do:
1. Load the letters dataset.
2. Explore its structure.
3. Use generative AI to write R code that:
   - visualizes the data,
   - analyzes word usage,
   - or performs any analysis you choose.
4. Reflect on how AI + R can support humanities research.

You will **generate most of the R code yourself** by asking an AI tool.


## How to Run Code in This Notebook

- Click any code cell.
- Press **Shift + Enter** to run it.
- New cells can be added with the "+" button above.

If a cell returns an error:
- Check the error message.
- Copy the full error message, paste it into an AI tool together with the code that produced it, and ask for a fix.
- Update column names or syntax as suggested, then run the cell again.

Errors are normal — they’re part of the process!
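
As a small illustration of reading an error message, the sketch below triggers a typical column-name mistake on purpose and then corrects it. The data frame `toy` and its values are invented for this example (they are not from the letters dataset), and only base R is used.

```r
# Hypothetical toy data; NOT the real letters dataset.
toy <- data.frame(
  text = c("Example letter one", "Example letter two"),
  recipient = c("Person A", "Person B"),
  stringsAsFactors = FALSE
)

# Asking for "Recipient" (capital R) fails because column names are case-sensitive.
# tryCatch() captures the error message instead of stopping the cell.
msg <- tryCatch(
  toy[, "Recipient"],
  error = function(e) conditionMessage(e)
)
print(msg)

# Listing the actual column names shows what needs correcting.
print(names(toy))

# The corrected call uses the exact column name.
fixed <- toy[, "recipient"]
print(fixed)
```

The captured message is the kind of text worth pasting into an AI tool along with the line of code that caused it.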


In [None]:
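# Data wrangling, dates, and plotting.
# library(tidyverse) already attaches several of these; loading them again is harmless.
library(tidyverse)
library(readr)
library(dplyr)
library(stringr)
library(lubridate)
library(tidytext)
library(ggplot2)
# Sentiment scoring and word clouds
library(syuzhet)
library(wordcloud)
library(RColorBrewer)

# Print the R version and loaded package versions for reference
sessionInfo()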


# Loading the Letters Dataset

The file **letters.csv** is in the same folder as this notebook. Loading it displays the first few rows along with the column names: **text**, **recipient**, and **date** (in YYYY-MM-DD format). You will need these column names when asking an AI tool to generate R scripts.

Run the next cell to load it.



In [None]:
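# Read the CSV into a data frame (show_col_types = FALSE hides the column-type message).
# Note: the name `letters` masks R's built-in `letters` vector (a to z) for this session.
letters <- read_csv("letters.csv", show_col_types = FALSE)

# Peek at the first six rows of the dataset
head(letters)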
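
Beyond `head()`, a few more calls can help with exploring the structure of the data. The following is a minimal sketch, assuming a data frame with the same three columns as letters.csv. The tibble `sample_letters` and every value in it are invented placeholders so the example runs on its own.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in with the same three columns; all values are invented.
sample_letters <- tibble(
  text = c("First example letter.", "Second example letter.", "Third example letter."),
  recipient = c("Person A", "Person B", "Person A"),
  date = as.Date(c("1916-01-05", "1917-03-12", "1917-11-30"))
)

# Column names and types at a glance
glimpse(sample_letters)

# Number of rows (one row per letter)
n_letters <- nrow(sample_letters)
print(n_letters)

# Earliest and latest dates
date_range <- range(sample_letters$date)
print(date_range)

# Number of letters per recipient, most frequent first
by_recipient <- sample_letters %>% count(recipient, sort = TRUE)
print(by_recipient)
```

In the notebook itself, replacing `sample_letters` with `letters` should run the same checks on the real data.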


# Using AI to Write R Code

You will now generate your own R code with the help of a generative AI tool.

### Successful prompt writing
Successful prompt writing requires that you give the AI tool some information about your environment and data. In this case, we can say:

**I am using an existing mybinder.org notebook to run R scripts. It includes a file, letters.csv, which has the columns text, date (in year-month-day format), and recipient. The mybinder has the following R libraries preloaded: tidyverse, readr, dplyr, stringr, lubridate, tidytext, ggplot2, syuzhet, wordcloud, RColorBrewer**

You can send this context before you ask the AI tool to write any R scripts, so it is ready for what comes next.
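
One optional way to avoid typos in column names is to let R draft the data description for you. The sketch below is only an illustration: `sample_letters` is an invented one-row stand-in for the real data, and the wording of the `context` string is an assumption you would adapt to your own prompt.

```r
# Hypothetical stand-in for the real data; the values are invented.
sample_letters <- data.frame(
  text = "An example letter.",
  recipient = "Person A",
  date = as.Date("1916-01-05"),
  stringsAsFactors = FALSE
)

# Pair each column name with its type, e.g. "date (Date)"
col_types <- vapply(sample_letters, function(col) class(col)[1], character(1))
col_summary <- paste0(names(col_types), " (", col_types, ")", collapse = ", ")

# Assemble a sentence that can be pasted into an AI tool
context <- paste0(
  "I am using R in a mybinder.org notebook. ",
  "My data frame has these columns: ", col_summary, "."
)
cat(context, "\n")
```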



# Task 1: Create a Visualization Using AI-Generated Code

First, prime your AI tool by describing your environment and data. Then, you can ask it to write an R script.
### Example prompts you might use:
- **“Write R code that counts how many letters were written per year and creates a bar chart.”**
- **“Write R code that extracts the 20 most frequent non-stopword words from the `text` column in letters.csv and creates a word cloud.”**
- **“Write R code that filters letters written between 1914 and 1918 and graphs them by month.”**
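
To give a sense of what an AI tool might return for the first prompt, here is a minimal sketch that counts letters per year and draws a bar chart. It is one possible answer, not the expected one. The tibble `sample_letters` and its values are invented so the code runs on its own, and the names `letters_per_year` and `year_plot` are placeholders.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Hypothetical stand-in for letters.csv; all values are invented.
sample_letters <- tibble::tibble(
  text = c("Example one.", "Example two.", "Example three.", "Example four."),
  recipient = c("Person A", "Person B", "Person A", "Person B"),
  date = as.Date(c("1915-06-01", "1916-02-14", "1916-09-30", "1917-04-09"))
)

# Extract the year from each date, then count letters per year
letters_per_year <- sample_letters %>%
  mutate(year = year(date)) %>%
  count(year, name = "n_letters")
print(letters_per_year)

# Bar chart with one bar per year
year_plot <- ggplot(letters_per_year, aes(x = factor(year), y = n_letters)) +
  geom_col() +
  labs(x = "Year", y = "Number of letters", title = "Letters per year (toy data)")
print(year_plot)
```

Code generated by your AI tool will probably differ in style; what matters is that it uses the real column names (`date`, `text`, `recipient`).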

### Instructions:
1. Ask an AI tool to write the R code for the visualization.  
2. Paste **only the code** in the cell below.  
3. Run it by pressing **Shift+Enter**.  
4. If errors appear, copy and paste both the error message and the code into the AI tool and ask it to help debug.


In [None]:
# Paste AI-generated visualization code here and run it.



# Task 2: Analyze the Text Using AI-Generated Code

Choose one of these text-analysis tasks:

1. **Find the most common words used by year or recipient**  
2. **Find the most common bigrams** (two-word phrases)  
3. **Calculate sentiment scores for each recipient and identify the dates with the highest and lowest scores**  
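
As an illustration of the second option, the sketch below counts bigrams with `unnest_tokens()` from tidytext. It is a minimal example under stated assumptions: `sample_letters` and its three short texts are invented, and the real letters would produce far more rows.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for letters.csv; the texts are invented.
sample_letters <- tibble::tibble(
  text = c(
    "the weather is cold today",
    "the weather is fine",
    "we are all well"
  ),
  recipient = c("Person A", "Person B", "Person A")
)

# Split each text into overlapping two-word phrases, then count them
bigram_counts <- sample_letters %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%
  count(bigram, sort = TRUE)

print(bigram_counts)
```

This version keeps stopwords such as "the" and "is". Asking the AI tool to remove them before counting is a reasonable follow-up prompt.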

### Instructions:
1. Ask an AI tool to write the R code for the text analysis.  
2. Paste **only the code** in the cell below.  
3. Run it by pressing **Shift+Enter**.  
4. If errors appear, copy and paste both the error message and the code into the AI tool and ask it to help debug.


In [None]:
# Paste AI-generated text-analysis code here and run it.

