# Introduction

## Vision

Our goal for this project is to explore how machine learning can enhance the quality of academic essays and papers. We begin by selecting an existing essay and using a machine learning model to create an initial outline of it. This outline is then checked and corrected by a person to ensure it is accurate and relevant. Subsequently, we use the corrected outline to guide the ML model in writing a new version of the essay.

To evaluate the effectiveness of our method, we compare the original essay, written by a human, with the new one written by the model. We examine each token (roughly, each word or word piece) in both essays and calculate how well it fits its context, measured by the model's loss on that token. If the model's token is a better fit, we replace the original token with the model's choice. This process allows us to experiment with different ways of providing context to the model: the original essay alone, the corrected outline alone, both together, the model-generated outline, or no context at all.

Our aim is to determine which method of providing information enables the model to perform its best in terms of making fewer mistakes and maintaining clear and well-organized text. If our project is successful, it would demonstrate that AI can help improve the original essay, resulting in a final piece that is clear, well-written, and enhanced with the help of the model.

## Background

For this project, the primary data consists of essays from previous college courses written by the group members. This dataset was selected because it provides authentic academic writing samples that reflect the typical structures, themes, and complexity expected in college-level essays. These essays serve as an ideal basis for evaluating the effectiveness of machine learning models in generating and refining academic text, as they represent real-world examples of content that students are expected to produce. Our confidence in the primary dataset stems from several factors:
- Relevance: Essays from previous courses are directly relevant to the project's aim, which is to enhance the quality and coherence of academic essays using AI.
- Diversity: These essays encompass a range of subjects and writing styles, providing a robust base for testing the AI's adaptability and accuracy in generating and refining text.
- Authenticity: Using actual student essays ensures that the project results are applicable in real-world educational settings, enhancing the practical value of the research.

# Team & Responsibilities

- Tyler Arista
  - Vision Statement
  - Background
  - Implemented Context Approaches
    - Model Generated Outline from Human essay
    - Generate new essay from Human Outline
    - Generate new essay from Human outline & essay
  - Modified Prompt Engineering Inputs
  - Final Presentation
- Lydia Powers
  - Introduction
  - Gathered Data
  - Prompt Engineering Approaches
  - Final Presentation
- Chang Liu
  - Implications
  - Implementation
  - Limitations & Future Directions
  - Final Presentation

# Methodology

### Model Selection
We chose Google's Gemma model, loaded through the Hugging Face Transformers library, because its Transformer architecture relies on attention mechanisms, which are particularly effective for natural language understanding and generation. Using a pre-trained model and fine-tuning it on our dataset applies transfer learning, which can achieve strong performance without the extensive and computationally expensive process of training from scratch.

### Data Preprocessing
Our dataset comprises academic essays, which we preprocessed by tokenizing them with the tokenizer associated with our chosen model. This step is necessary because it converts raw text into the token IDs that the model takes as input. Proper data structuring is also key to minimizing semantic errors during generation.

### Training Methodology
The model was fine-tuned on our academic essay dataset using a specific batch size and learning rate. We monitored the training process for signs of overfitting by evaluating the loss on a validation set at each epoch. Adjustments to learning rates were made based on the model's performance on the validation set.

### Error Analysis
Error analysis involved examining the types of errors produced in the generated essays. This analysis is crucial for diagnosing issues with the model’s understanding and output, guiding subsequent rounds of fine-tuning.

### Performance Evaluation
We employed several metrics to evaluate the effectiveness of the model, including token loss and BLEU scores. These metrics help quantify how well the model performs in terms of generating coherent, grammatically correct, and contextually relevant text.
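
To make the BLEU metric more concrete, here is a minimal, self-contained sketch of the idea behind it: clipped n-gram precision combined with a brevity penalty. The function names `ngrams` and `simple_bleu` and the example sentences are illustrative assumptions, and the sketch uses a single reference with no smoothing, so it is not the scoring code used in this notebook.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(reference, candidate, max_n=2):
    """Simplified single-reference BLEU (hypothetical helper): geometric mean
    of clipped n-gram precisions for n = 1..max_n, times a brevity penalty."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Toy sentences, split on whitespace for simplicity.
reference = "the cat sat on the mat".split()
candidate = "the cat sat on a mat".split()
print(round(simple_bleu(reference, candidate), 3))  # → 0.707
```

In this toy case five of six unigrams and three of five bigrams match, so the score is the square root of 5/6 × 3/5 = 0.5. Short paragraphs often have no matching higher-order n-grams at all, which is one reason smoothed variants of BLEU are commonly used for sentence-level comparisons.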


# Implementation

## Setup and Configuration
This section sets up the environment, including the libraries and frameworks used.

In [1]:
!pip install textstat

Collecting textstat
  Downloading textstat-0.7.3-py3-none-any.whl.metadata (14 kB)
Collecting pyphen (from textstat)
  Downloading pyphen-0.15.0-py3-none-any.whl.metadata (3.3 kB)
Downloading textstat-0.7.3-py3-none-any.whl (105 kB)
Downloading pyphen-0.15.0-py3-none-any.whl (2.1 MB)
Installing collected packages: pyphen, textstat
Successfully installed pyphen-0.15.0 textstat-0.7.3


In [2]:
import torch, os
import textstat
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk import sent_tokenize, word_tokenize
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
# Work around a bug in the version of PyTorch and GPU hardware currently on Kaggle. On other hardware, removing these lines may lead to a speed-up.
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)

# Load the model
USE_INSTRUCTION_TUNED = True  # set to False to load the base (non-instruction-tuned) model
if USE_INSTRUCTION_TUNED:
    model_name = '/kaggle/input/gemma/transformers/1.1-2b-it/1'
    if not os.path.exists(model_name):
        print("Warning: loading model weights from the Internet. This might take a bit of extra time.")
        model_name = "google/gemma-1.1-2b-it"
else:
    model_name = "/kaggle/input/gemma/transformers/2b/2"
    if not os.path.exists(model_name):
        print("Warning: loading model weights from the Internet. This might take a bit of extra time.")
        model_name = "google/gemma-2b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map='auto',
    torch_dtype=torch.bfloat16)
streamer = TextStreamer(tokenizer)
# Silence a warning.
tokenizer.decode([tokenizer.eos_token_id]);



## Data Preparation
In this section, we organize the materials needed for a structured analysis. For each of three essays (P1, P2, and P3), we record the essay prompt, the outline given by the user, and the human-written essay, paragraph by paragraph. The first prompt asks how the deliberate artistic strategies used by the authors and filmmaker shape the way they present their historical narratives.

### P1's Essay Components
#### Prompt

In [3]:
p1_essay_prompt = '''In Homegoing, Pachinko, and Stories We Tell, the authors/filmmaker choose deliberate artistic strategies to present the histories they narrate. Discuss how the literary/filmic choices of Gyasi, Lee, and Polley are part of their overall theory about how the past should be constructed. In other words, how does the way in which they present their stories intersect with what they are trying to say in their stories? Are there commonalities in their aims? If so, what are they? Are there critical differences?'''

#### Outline given by the User

In [4]:
p1_outline_first_paragraph = '''
Thesis: Authors and filmmaker use multiple perspectives to illustrate how personal and ancestral choices shape individual narratives and identities, demonstrating their theories on how history should be constructed.
Brief introduction of works and creators: Discusses Homegoing by Yaa Gyasi, Pachinko by Min Jin Lee, and Stories We Tell by Sarah Polley, setting the stage for an exploration of their narrative techniques.
'''

In [5]:
p1_outline_second_paragraph = '''
Main idea: Personal decisions directly shape characters' identities and futures in all three works, highlighting the authors' and filmmaker's focus on the impact of individual agency within broader historical and social contexts.
Explanation: This exploration of personal choice aligns with the creators' views that history is not merely a series of events but a complex tapestry woven from individual actions and their consequences.
Example from Pachinko: Sunja's pivotal decision to engage with Hansu, and its ramifications, showcase how personal mistakes and moral dilemmas are central to character development and plot progression.
Example from Homegoing: The character H’s experience with forced labor and subsequent physical development illustrates how personal endurance and adaptation to circumstances reflect broader historical forces like slavery and institutional racism.
Example from Stories We Tell: Harry’s narrative and his one-sided love affair demonstrate how personal perceptions can deeply influence one’s identity and the stories they choose to tell or believe.
'''

In [6]:
p1_outline_third_paragraph = '''
Main idea: Characters’ identities and life paths are significantly influenced by the actions and statuses of those around them and their ancestors, emphasizing the interconnectedness of personal histories within larger societal narratives.
Explanation: This theme underscores the authors' and filmmaker's perspective that individual lives are not isolated but are deeply affected by the historical and relational contexts in which they exist, echoing a broader theory that history is constructed collectively rather than singularly.
Example from Pachinko: Isak’s altruistic decision to marry Sunja provides a stark contrast to her initial dilemma, showing how benevolent actions from others can redirect an individual’s life trajectory dramatically.
Example from Homegoing: The legacy of slavery, as seen through Esi and her descendant Ness, highlights how ancestral histories cast long shadows over the lives of future generations, shaping identities and opportunities long after the original events have passed.
Example from Stories We Tell: The revelation of Sarah’s true paternity and Diane’s decisions regarding her upbringing illustrate the profound impact parental choices have on children’s identities and their understanding of family narratives.
'''

#### Human Written Essay

In [18]:
p1_essay_first_paragraph = '''Many people believe that there is only one story going on in the world, and they are the main characters. They believe that everything should revolve around them, and everybody else in this world is a small side character. We often don't think that there are billions of different stories happening all at the same time, which can then affect our narratives through their perspectives. The authors and filmmakers of Homegoing, Pachinko, and Stories We Tell, use the perspective of many people in similar situations to demonstrate how the narratives of people’s lives can be shaped and defined by the choices they make personally, and the people around them including their ancestors. '''

p1_flesch_reading_ease_first_paragraph = textstat.flesch_reading_ease(p1_essay_first_paragraph)
print(f"Flesch Reading Ease: {p1_flesch_reading_ease_first_paragraph}")

Flesch Reading Ease: 51.52
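
For readers unfamiliar with the score printed above, the following sketch shows the standard Flesch Reading Ease formula computed from raw counts. The function below and its example counts are illustrative assumptions; `textstat` does its own sentence and syllable counting, which is not reproduced here, so this sketch is not expected to match its output exactly on real text.

```python
def flesch_reading_ease_from_counts(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease formula applied to raw counts.

    Longer sentences and more syllables per word both lower the score.
    """
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Made-up counts for illustration: 100 words, 4 sentences, 140 syllables.
score = flesch_reading_ease_from_counts(100, 4, 140)
print(round(score, 2))  # → 63.02
```

Higher scores indicate text that is easier to read, so a paragraph with long sentences and many multi-syllable words scores lower than one written in short, plain sentences.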


In [20]:
p1_essay_second_paragraph = '''People’s identities are often defined by the choices they made in the past, whether they are successful or unsuccessful, honorable or dishonorable. The authors and filmmakers of these stories demonstrate how a person’s choices can shape their narratives. In the novel, Pachinko by Min Jin Lee, we mainly see the story through the viewpoint of Sunja, but we will occasionally get to look through or see the thoughts of the people close to her. In the novel, it states, “If he did not marry her, she was a common slut who would be disgraced forever. The child would be another no-name bastard. Her mother’s boardinghouse would be contaminated by her shame”(Lee 49). Sunja made a big mistake by being with Hansu, and she paid the price. She was now pregnant, and the father of the child won’t be able to marry her because he is already married. In her society, bearing a child without a father will lead to having the mother, in this case, Sunja, be disowned. It could also lead to her family and her child being disowned as well. Looking through the lens of Sunja, we could see how devastated she was and the impact her decisions could make on her in the future. Also, in the novel, Homegoing by Yaa Gyasi, we see through many different lenses in many different generations. In the chapter about H, it states, “The boss man was called Mr. John. He asked to take off his shirt. He inspected the muscles on his back and his arms and whistled. ‘Any man what can spend ten years working at Rock Slope and live to tell about it’s worth watching”(Gyasi 169). When H was sent to me while he was arrested, he might not have had an option on whether or not he wanted to work, but it made him better in the long run. He had gotten physically strong, and when he had gotten released, he was able to find a mining job that paid him. The situation he was put in shaped the person that H was. He became a hard worker and he made sure he did his job. 
Those were some of the things that he learned while he mined for the jail, but he was able to apply those lessons to the real world and shaped him to be somebody that was hirable. Another example of identities being defined by self choices is in the documentary, Stories We Tell, produced by Sarah Polley. When Harry met with Diane when she traveled to Montreal for a play, Harry fell in love with her. He had his own one-sided story, where he thought he was going to be with Diane for the rest of his life. This ultimately shaped his identity for the future. At the end of the documentary, Sarah asked Harry whether he liked that many people are sharing their sides of the story or if he disliked it. He shared that he did not like how many people shared their sides because he thought it was only his story to tell. He got wrapped up in the sense that Diane only loved him, and that changed the way he viewed his story.'''

p1_flesch_reading_ease_second_paragraph = textstat.flesch_reading_ease(p1_essay_second_paragraph)
print(f"Flesch Reading Ease: {p1_flesch_reading_ease_second_paragraph}")

Flesch Reading Ease: 77.47


In [21]:
p1_essay_third_paragraph = '''While some believe that their identity is solely based on their choices, people’s identities can be altered by the people around them and by their ancestors. When we start looking through the other lenses of people in the same situation, we can see that many times the expected outcome is different from reality. In the novel, Pachinko, it states, “‘Of course it would be far better for them if she went away’ Yangjin replied, knowing the hard truth. ‘The child would have a terrible life here. You’d be saving my daughter’s life as well'”(Lee 74). Isak knew the dilemma Sunja was in and he knew this was something that shouldn’t be taken lightly. After coming up with the idea of marrying Sunja and giving the child his last name, he had lifted the burden off Sunja and her child’s back. They would no longer have to bear the weight of being dishonored by society and her family. Having people who can have a different perspective of the matter, can allow one to alter the course of somebody. Then, in the novel, Homegoing, we were able to witness the identities being altered because of their past relatives. In the novel, it states, “Every day, Ness picked cotton under the punishing eye of the southern sun. She had been at Thomas Allan Stockham’s Alabama plantation for three months”(Gyasi 70). Ness was a descendant of Esi in this story. Esi was a part of the slave trade system and her life was very rough. Esi went through an enormous amount of abuse and unbearable situations. Since being in that slave system, it had affected the rest of her family tree. Ness is the daughter of Esi, and so, Ness had to bear the identity of being a slave just like her mother. She had to be a slave on a plantation and no choice she made would be able to change that in this system. Her identity at the time was determined by her mother and how she was a part of the system. 
Lastly, in the documentary, Stories We Tell, Sarah’s identity was shaped by the choices her mother made when she was alive. Sarah never knew that Michael was not her biological father until she grew up. Sarah was able to form a strong bond with Michael, and that was because Diane decided to have Michael take care of her instead of Harry. Once Sarah found out that Harry was her father, Sarah was still able to keep the close relationship with Michael even after finding out the truth. Sarah was able to keep her identity from the one she formed with Michael, and that was all due to Diane.'''

p1_flesch_reading_ease_third_paragraph = textstat.flesch_reading_ease(p1_essay_third_paragraph)
print(f"Flesch Reading Ease: {p1_flesch_reading_ease_third_paragraph}")

Flesch Reading Ease: 77.87


### P2's Essay Components
#### Prompt

In [22]:
p2_prompt = '''Write a one-page essay based on the book Things Fall Apart by Chinua Achebe that discusses the relationship between Okonkwo and Ikemefuna'''

#### Outline given by the User

In [23]:
p2_outline_first_paragraph = '''
Brief introduction to the context of the scene where Okonkwo eats a locust.
Mention of Ezeudu’s arrival and the news he brings about Ikemefuna’s fate.'''

In [24]:
p2_outline_second_paragraph = '''
Description of Ezeudu as a respected elder and his unique position to confront Okonkwo.
Examination of Okonkwo’s motivations: fear of appearing weak and his disregard for Ezeudu’s advice.
'''

In [25]:
p2_outline_third_paragraph = '''
Reflection on Okonkwo’s fondness for Ikemefuna and the impact on his son Nwoye.
Analysis of the father-son dynamic between Okonkwo and Ikemefuna.
'''

In [26]:
p2_outline_forth_paragraph = '''
Exploration of Okonkwo’s internal conflict between his self-image and his love for Ikemefuna.
Detailed recount of the moment Okonkwo kills Ikemefuna, highlighting the betrayal and its emotional impact.
'''

In [27]:
p2_outline_fifth_paragraph = '''
Discussion of how Okonkwo’s perception of his own father influences his actions and self-perception.
Analysis of the significance of being called "father" by Ikemefuna and its implications on Okonkwo’s decision.
'''

In [28]:
p2_outline_sixth_paragraph = '''
Summation of how Okonkwo’s character complexities and his fear of weakness lead to tragic choices.
Reflection on the themes of identity and legacy in "Things Fall Apart" as illustrated through Okonkwo’s actions and decisions.
'''

#### Human Written Essay

In [29]:
p2_essay_first_paragraph = '''When I first start reading the passage, I think about what Okonkwo is eating. I have never eaten a locust and I am not sure if I would want to. But it is interesting that Okonkwo is eating something considered a rare delicacy when Ezeudu comes to tell him that Ikemefuna is going to die.'''

p2_flesch_reading_ease_first_paragraph = textstat.flesch_reading_ease(p2_essay_first_paragraph)
print(f"Flesch Reading Ease: {p2_flesch_reading_ease_first_paragraph}")

Flesch Reading Ease: 69.82


In [30]:
p2_essay_second_paragraph = '''The passage states that Ezeudu is the oldest man in the village and is given great respect. It makes me wonder if others felt the same way he did about Okonkwo and Ikemefuna’s relationship, but he is the only one in position to confront Okonkwo. I feel like Okonkwo acts too much in the present without thinking. He killed Ikemefuna because he did not want to be seen as weak. Ezeudu tells him not to but he does not worry about what Ezeudu will think of him, only what the people around him think.'''

p2_flesch_reading_ease_second_paragraph = textstat.flesch_reading_ease(p2_essay_second_paragraph)
print(f"Flesch Reading Ease: {p2_flesch_reading_ease_second_paragraph}")

Flesch Reading Ease: 77.77


In [31]:
p2_essay_third_paragraph = '''I cannot imagine caring about your own self image so much that you would be willing to kill someone you love. Not only that, Ikemefuna called to Okonkwo when the others attacked him. Ikemefuna trusted Okonkwo. But in the end Okonkwo ran to Ikemefuna and killed him instead of saving him. The hope that Ikemefuna must have felt when he saw his ‘father’ run toward him when he cried must have given way to terror when he saw Okonkwo raise his machete and strike him down.'''

p2_flesch_reading_ease_third_paragraph = textstat.flesch_reading_ease(p2_essay_third_paragraph)
print(f"Flesch Reading Ease: {p2_flesch_reading_ease_third_paragraph}")

Flesch Reading Ease: 79.4


In [32]:
p2_essay_forth_paragraph = '''I cannot imagine caring about your own self image so much that you would be willing to kill someone you love. Not only that, Ikemefuna called to Okonkwo when the others attacked him. Ikemefuna trusted Okonkwo. But in the end Okonkwo ran to Ikemefuna and killed him instead of saving him. The hope that Ikemefuna must have felt when he saw his ‘father’ run toward him when he cried must have given way to terror when he saw Okonkwo raise his machete and strike him down.'''

p2_flesch_reading_ease_forth_paragraph = textstat.flesch_reading_ease(p2_essay_forth_paragraph)
print(f"Flesch Reading Ease: {p2_flesch_reading_ease_forth_paragraph}")

Flesch Reading Ease: 79.4


In [33]:
p2_essay_fifth_paragraph = '''It makes me think of Okonkwo’s relationship with his father. How he hated his father for being lazy, and I wonder if his view of a father figure affected his decision as well. When Ezeudu talked to Okonkwo he said not to kill Ikemefuna because he called Okonkwo father. Maybe in his mind, although he cared for Ikemefuna, he saw himself in a bad light whenever he was called father.'''

p2_flesch_reading_ease_fifth_paragraph = textstat.flesch_reading_ease(p2_essay_fifth_paragraph)
print(f"Flesch Reading Ease: {p2_flesch_reading_ease_fifth_paragraph}")

Flesch Reading Ease: 70.63


In [34]:
p2_essay_sixth_paragraph = '''Reflecting on these moments in the story underscores the complexities of Okonkwo’s character and his conflicting roles as father, warrior, and man driven by the fear of appearing weak. This ultimately shapes his tragic choices, illuminating the broader themes of identity and legacy within the novel.'''

p2_flesch_reading_ease_sixth_paragraph = textstat.flesch_reading_ease(p2_essay_sixth_paragraph)
print(f"Flesch Reading Ease: {p2_flesch_reading_ease_sixth_paragraph}")

Flesch Reading Ease: 39.67


### P3's Essay Components
#### Prompt

In [35]:
p3_prompt = '''Evaluate the ethical implications of architectural design in public spaces, using Tiananmen Square as a case study.'''

#### Outline given by the User

In [36]:
p3_outline_first_paragraph = '''Overview of Tiananmen Square's significance and thesis statement.'''

In [37]:
p3_outline_second_paragraph = '''Explanation of Tiananmen Square's design to serve people and state leaders. Versatility for various events and ceremonies.'''

In [38]:
p3_outline_third_paragraph = '''Description of nearby crucial buildings and their roles. Analysis of how these buildings enhance Tiananmen Square's importance. Detailing the purposes of Tiananmen Square's three entrances. Illustration of the diverse events facilitated through each entrance.'''

In [39]:
p3_outline_forth_paragraph = '''Recap of Tiananmen Square's ethical architecture. Emphasis on its enduring significance in Chinese culture and history'''

#### Human Written Essay

In [87]:
p3_essay_first_paragraph = '''Tiananmen Square is mostly ethical because it was designed to serve the people and the state leaders in many ways, it was surrounded by important buildings, and it had three entrances each serving in a different way. '''

p3_flesch_reading_ease_first_paragraph = textstat.flesch_reading_ease(p3_essay_first_paragraph)
print(f"Flesch Reading Ease: {p3_flesch_reading_ease_first_paragraph}")

Flesch Reading Ease: 42.38


In [88]:
p3_essay_second_paragraph = '''Tiananmen Square can be used as a place for reviewing troops, visiting, and all kinds of parades. In most independence days and army days, China will have a big reviewing troops ceremony going on in Tiananmen Square. Soldiers and all kinds of weaponed cars will march through Tiananmen Square. It shows to the world that China is a powerful country. Tiananmen Square is also a popular visit place for foreigners. There were millions of people visiting every year. It shows both the modern and ancient side of China. A lot of people might not know but Tiananmen Square also has a lot of parades going on every year. Especially during the spring festival, there will be lots of decorated cars and dancing people around Tiananmen Square to celebrate the coming of the new year. '''

p3_flesch_reading_ease_second_paragraph = textstat.flesch_reading_ease(p3_essay_second_paragraph)
print(f"Flesch Reading Ease: {p3_flesch_reading_ease_second_paragraph}")

Flesch Reading Ease: 73.27


In [89]:
p3_essay_third_paragraph = '''Tiananmen Square is located in the center of the city of Beijing which is also surrounded by lots of crucial buildings like the Chinese museum, the Great Hall of People, and the Forbidden City. On the east side of Tiananmen Square stands the Chinese museum. The Chinese museum is the largest comprehensive museum in China. It contains some of China’s most valuable national treasures. To the west is the Great Hall of People where the state leaders and the president hold all the important meetings. Lastly to the north was the Forbidden City. The Forbidden City is one of the biggest wooden structures in the world. It was built 600 years ago and the royal family of ancient China lived there for hundreds of years. Tiananmen Square was surrounded by all of those important architects. '''

p3_flesch_reading_ease_third_paragraph = textstat.flesch_reading_ease(p3_essay_third_paragraph)
print(f"Flesch Reading Ease: {p3_flesch_reading_ease_third_paragraph}")

Flesch Reading Ease: 64.71


In [90]:
p3_essay_forth_paragraph = '''It can be used in ways, it is surrounded by some of the most significant buildings in China, and even its three doors have different meanings. All of these reasons made Tiananmen Square an ethical building. '''

p3_flesch_reading_ease_forth_paragraph = textstat.flesch_reading_ease(p3_essay_forth_paragraph)
print(f"Flesch Reading Ease: {p3_flesch_reading_ease_forth_paragraph}")

Flesch Reading Ease: 70.13


## Token Loss Computation

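The function below computes the loss for each token in an essay. It tokenizes the essay, runs the model over it once, and at each position compares the token that actually appears with the model's predicted distribution over next tokens. The loss for a token is the negative log of the probability the model assigned to it, so a high loss means the token was surprising to the model. These per-token losses point out parts of the essay where the writing departs from what the model expects, showing where there might be room for improvement or a distinctive style choice. The function also records the model's most likely token at each position and a `loss_ratio`, which is each token's loss divided by the highest loss in the essay.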

In [91]:
import torch
import pandas as pd

def analyze_text(essay):
    # Tokenize the essay for comparison
    essay_ids = tokenizer.encode(essay, return_tensors='pt').to(model.device)
    
    # Generate logits for the essay input
    with torch.no_grad():
        outputs = model(essay_ids)
        logits = outputs.logits
    
    # Analyze tokens for loss
    spans = []
    highest_loss = float('-inf')
    softmax = torch.nn.Softmax(dim=-1)
    essay_tokens = tokenizer.convert_ids_to_tokens(essay_ids[0])
    
    for i in range(1, essay_ids.size(1)):  # Start from 1: the first token (Gemma's <bos>) has no preceding context to predict it from
        probs = softmax(logits[0, i - 1])
        token_loss = -torch.log(probs[essay_ids[0, i]]).item()
        most_likely_token_id = torch.argmax(probs).item()
        token = essay_tokens[i]  # Adjust index for essay tokens
        most_likely_token = tokenizer.decode([most_likely_token_id])
        
        spans.append({
            'original_token': token,
            'token_loss': token_loss,
            'most_likely_token': most_likely_token,
            'loss_ratio': None  # To be calculated later
        })
        
        if token_loss > highest_loss:
            highest_loss = token_loss

    # Normalize loss ratios
    for span in spans:
        span['loss_ratio'] = span['token_loss'] / highest_loss if highest_loss != 0 else 0

    df = pd.DataFrame(spans)
    return df
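
The arithmetic inside `analyze_text` can be checked by hand on a toy example. The sketch below assumes a made-up three-token vocabulary and made-up probabilities in place of real model logits; the helper name `token_losses` is hypothetical and only mirrors the loss, most-likely-token, and `loss_ratio` steps of the function above.

```python
import math


def token_losses(step_probs, token_ids):
    """Toy version of the per-token loss computation (hypothetical helper).

    step_probs[i] is the predicted next-token distribution used to score
    token_ids[i], the token that actually appears at that position.
    """
    rows = []
    for probs, actual in zip(step_probs, token_ids):
        # Loss is the negative log-probability of the actual token.
        loss = -math.log(probs[actual])
        # Index of the token the "model" considered most likely.
        most_likely = max(range(len(probs)), key=probs.__getitem__)
        rows.append({"token_id": actual, "token_loss": loss, "most_likely_id": most_likely})
    # Normalize every loss by the highest loss in the sequence.
    highest = max(row["token_loss"] for row in rows)
    for row in rows:
        row["loss_ratio"] = row["token_loss"] / highest if highest != 0 else 0.0
    return rows


# Made-up distributions over a 3-token vocabulary, one per position.
step_probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.25, 0.5, 0.25]]
token_ids = [0, 1, 1]
rows = token_losses(step_probs, token_ids)
for row in rows:
    print(row)
```

Here the second token has probability 0.1, so it gets the highest loss and a `loss_ratio` of 1. Because the ratio divides by the single largest loss, one outlier compresses every other ratio toward zero; in the result tables in this notebook, the first token of each paragraph plays that role.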


### Token Loss for Each Original Human Written Paragraph (P1)

In [45]:
p1_analysis_df_w_no_context_first_paragraph = analyze_text(p1_essay_first_paragraph)
display(p1_analysis_df_w_no_context_first_paragraph[['original_token', 'token_loss', 'most_likely_token', 'loss_ratio']])

| | original_token | token_loss | most_likely_token | loss_ratio |
|---|---|---|---|---|
| 0 | Many | 76.802307 | increa | 1.000000 |
| 1 | ▁people | 1.230443 | people | 0.016021 |
| 2 | ▁believe | 1.468063 | believe | 0.019115 |
| 3 | ▁that | 0.070375 | that | 0.000916 |
| 4 | ▁there | 4.952903 | the | 0.064489 |
| ... | ... | ... | ... | ... |
| 125 | ▁including | 15.139212 | . | 0.197119 |
| 126 | ▁their | 1.828609 | the | 0.023809 |
| 127 | ▁ancestors | 5.650682 | families | 0.073574 |
| 128 | . | 0.541190 | . | 0.007047 |


In [46]:
p1_analysis_df_w_no_context_second_paragraph = analyze_text(p1_essay_second_paragraph)

In [47]:
p1_analysis_df_w_no_context_third_paragraph = analyze_text(p1_essay_third_paragraph)

### Token Loss for Each Paragraph (P2)

In [48]:
p2_analysis_df_w_no_context_first_paragraph = analyze_text(p2_essay_first_paragraph)
display(p2_analysis_df_w_no_context_first_paragraph[['original_token', 'token_loss', 'most_likely_token', 'loss_ratio']])

| | original_token | token_loss | most_likely_token | loss_ratio |
|---|---|---|---|---|
| 0 | When | 75.177307 | increa | 1.000000 |
| 1 | ▁I | 4.480940 | a | 0.059605 |
| 2 | ▁first | 4.088663 | am | 0.054387 |
| 3 | ▁start | 1.920110 | started | 0.025541 |
| 4 | ▁reading | 8.119610 | a | 0.108006 |
| ... | ... | ... | ... | ... |
| 63 | ▁is | 1.066990 | has | 0.014193 |
| 64 | ▁going | 3.925479 | dead | 0.052216 |
| 65 | ▁to | 0.012439 | to | 0.000165 |
| 66 | ▁die | 2.679891 | kill | 0.035648 |


In [49]:
p2_analysis_df_w_no_context_second_paragraph = analyze_text(p2_essay_second_paragraph)

In [50]:
p2_analysis_df_w_no_context_third_paragraph = analyze_text(p2_essay_third_paragraph)

In [51]:
p2_analysis_df_w_no_context_forth_paragraph = analyze_text(p2_essay_forth_paragraph)

In [52]:
p2_analysis_df_w_no_context_fifth_paragraph = analyze_text(p2_essay_fifth_paragraph)

In [53]:
p2_analysis_df_w_no_context_sixth_paragraph = analyze_text(p2_essay_sixth_paragraph)

### Token Loss for Each Paragraph (P3)

In [92]:
p3_analysis_df_w_no_context_first_paragraph = analyze_text(p3_essay_first_paragraph)
display(p3_analysis_df_w_no_context_first_paragraph[['original_token', 'token_loss', 'most_likely_token', 'loss_ratio']])

| | original_token | token_loss | most_likely_token | loss_ratio |
|---|---|---|---|---|
| 0 | Tian | 75.677307 | increa | 1.0 |
| 1 | an | 8.312093 | yi | 0.109836 |
| 2 | men | 4.037255 | , | 0.053348 |
| 3 | ▁Square | 0.802977 | Square | 0.010611 |
| 4 | ▁is | 1.400564 | is | 0.018507 |
| 5 | ▁mostly | 13.675731 | a | 0.180711 |
| 6 | ▁ethical | 20.971968 | known | 0.277124 |
| 7 | ▁because | 3.359697 | ? | 0.044395 |
| 8 | ▁it | 0.720817 | it | 0.009525 |
| 9 | ▁was | 5.215481 | is | 0.068917 |


In [55]:
p3_analysis_df_w_no_context_second_paragraph = analyze_text(p3_essay_second_paragraph)

In [56]:
p3_analysis_df_w_no_context_third_paragraph = analyze_text(p3_essay_third_paragraph)

# Different Context Approaches To Create Paragraphs
## Generate an Outline from a Human-Written Essay & Revise That Outline to Generate a New Paragraph
In this section, we use an ML model to convert a provided essay paragraph into a detailed outline that lists the main ideas and supporting details. The outline is then revised, and the revised outline is used to prompt the model to write a new paragraph.
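
The two steps described here (essay paragraph to outline, then outline to new paragraph) reuse nearly the same prompt text in every cell of this section. A minimal sketch of how those prompts could be assembled by helper functions follows. `build_outline_prompt` and `build_paragraph_prompt` are hypothetical names that do not appear in the notebook, and the prompt wording is abbreviated from the introduction-paragraph prompts.

```python
def build_outline_prompt(paragraph: str) -> str:
    """Prompt asking the model to turn an essay paragraph into an outline."""
    return (
        "Please break this introduction paragraph into a detailed outline "
        "format focusing on the thesis statement and a brief introduction: "
        f"{paragraph} "
    )


def build_paragraph_prompt(thesis: str, outline_points: list[str]) -> str:
    """Prompt asking the model to write a paragraph from a revised outline."""
    # One "- " bullet per outline point, matching the prompts in this section
    bullets = "\n".join(f"- {point}" for point in outline_points)
    return (
        "\nUsing the outline and thesis provided below, compose a "
        "comprehensive introductory paragraph.\n\n"
        f"**Thesis Statement:** {thesis}\n\n"
        "**Detailed Outline for Introduction Paragraph:**\n\n"
        f"{bullets}\n"
    )


# Placeholder inputs for illustration only (not the notebook's data)
prompt = build_paragraph_prompt(
    thesis="Example thesis.",
    outline_points=["First point.", "Second point."],
)
print(prompt)
```

The returned string would take the place of `doc` in the `tokenizer(...)` and `model.generate(...)` calls, so only the thesis and outline points change from one paragraph to the next.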

### P1's Essay
#### First Paragraph

In [57]:
%%time
doc = f'''Please break this introduction paragraph into a detailed outline format focusing on the thesis statement and a brief introduction: {p1_essay_first_paragraph} '''
model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=300,
    do_sample=False,
    streamer=streamer
)

<bos>Please break this introduction paragraph into a detailed outline format focusing on the thesis statement and a brief introduction: Many people believe that there is only one story going on in the world, and they are the main characters. They believe that everything should revolve around them, and everybody else in this world is a small side character. We often don't think that there are billions of different stories happening all at the same time, which can then affect our narratives through their perspectives. The authors and filmmakers of Homegoing, Pachinko, and Stories We Tell, use the perspective of many people in similar situations to demonstrate how the narratives of people’s lives can be shaped and defined by the choices they make personally, and the people around them including their ancestors.  

**Thesis Statement:** This paper argues that the multiplicity of stories in the world reflects the richness and complexity of human experience, and that understanding the narrat

In [58]:
doc = """
Using the outline and thesis provided below, compose a comprehensive introductory paragraph. The paragraph should seamlessly integrate the points from the outline into a cohesive narrative that sets the stage for the paper. This introduction should capture the essence of the thesis and outline, providing a clear and engaging overview of the topics to be discussed.

**Thesis Statement:** This paper argues that the multiplicity of stories in the world reflects the richness and complexity of human experience, and that understanding the narratives of others can provide valuable insights into our own lives and the world around us.

**Detailed Outline for Introduction Paragraph:**

- Start by addressing the common belief of a singular narrative in the world, pointing out how prevalent this view is.
- Move on to highlight the underestimation of the multiplicity of stories, emphasizing the diversity and richness of different narratives.
- Introduce the authors and films discussed in the paper, such as Homegoing by Yaa Gyasi, Pachinko by Min Jin Lee, and Stories We Tell by Sarah Polley, setting the stage for an exploration of their narrative techniques.
- Explain how narratives shape our perspectives and understanding of the world, noting the influence of storytelling on personal and collective consciousness.
- Discuss the role of personal choices and the impact of others in shaping these narratives, connecting this discussion to the thesis.
- Highlight how the narratives of Homegoing, Pachinko, and Stories We Tell provide context and meaning, demonstrating the influence of personal agency and the choices made by individuals.
- Conclude by linking these discussions back to the thesis, reaffirming the importance of understanding diverse narratives for personal and collective growth, and setting the expectation for the rest of the paper.
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Higher token limit to leave room for a full paragraph
    do_sample=False,  # No sampling, so the output is deterministic
    streamer=streamer
)

<bos>
Using the outline and thesis provided below, compose a comprehensive introductory paragraph. The paragraph should seamlessly integrate the points from the outline into a cohesive narrative that sets the stage for the paper. This introduction should capture the essence of the thesis and outline, providing a clear and engaging overview of the topics to be discussed.

**Thesis Statement:** This paper argues that the multiplicity of stories in the world reflects the richness and complexity of human experience, and that understanding the narratives of others can provide valuable insights into our own lives and the world around us.

**Detailed Outline for Introduction Paragraph:**

- Start by addressing the common belief of a singular narrative in the world, pointing out how prevalent this view is.
- Move on to highlight the underestimation of the multiplicity of stories, emphasizing the diversity and richness of different narratives.
- Introduce the authors and films discussed in the 
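
The streamed output above repeats the prompt before the model's continuation, and the next cell stores the generated paragraph as a hand-copied string. Below is a minimal sketch of how the continuation could instead be separated from the echoed prompt in code, assuming the decoded text begins with `<bos>` followed by the prompt exactly as shown. `extract_continuation` is a hypothetical helper, not part of the notebook.

```python
def extract_continuation(decoded: str, prompt: str, bos: str = "<bos>") -> str:
    """Return the text generated after the echoed prompt.

    Assumes `decoded` is the full decoded output (prompt + continuation),
    optionally prefixed with the beginning-of-sequence marker.
    """
    text = decoded
    if text.startswith(bos):
        text = text[len(bos):]
    if not text.startswith(prompt):
        raise ValueError("decoded text does not start with the prompt")
    return text[len(prompt):].strip()


# Toy strings for illustration only (not model output)
decoded = "<bos>Write one sentence.\n\nThis is the continuation."
print(extract_continuation(decoded, "Write one sentence."))
# prints: This is the continuation.
```

If `model` is a Hugging Face Transformers model, as the `generate` call suggests, slicing the returned token ids at the prompt length before decoding is a common alternative to string matching.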

In [86]:
p1_generated_first_paragraph_models_outline = '''Despite the pervasive belief of a singular narrative dominating the human experience, the reality is far more nuanced. The multiplicity of stories that populate our world reflects the richness and complexity of human experience, offering a wealth of insights into our own lives and the world around us. This paper delves into the power of narratives, exploring how they shape our perspectives, influence personal and collective consciousness, and provide context and meaning. By examining the narratives of Yaa Gyasi, Min Jin Lee, and Sarah Polley, we will discover how personal choices and the choices of others shape the narratives we embrace, ultimately contributing to our personal growth and collective understanding.'''
p1_analysis_df_w_models_outline_first_paragraph = analyze_text(p1_generated_first_paragraph_models_outline)
display(p1_analysis_df_w_models_outline_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,Despite,78.489807
1,▁the,1.015811
2,▁pervasive,8.685407
3,▁belief,4.847913
4,▁of,3.244974
...,...,...
122,▁growth,2.388786
123,▁and,0.023895
124,▁collective,0.589745
125,▁understanding,0.087458


#### Second Paragraph

In [47]:
%%time
doc = f'''Please convert the following detailed discussion into a structured outline, emphasizing the main thesis about how personal choices shape identities and providing key examples from the texts: {p1_essay_second_paragraph} '''
model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=128,
    do_sample=False,
    streamer=streamer
)

<bos>Please convert the following detailed discussion into a structured outline, emphasizing the main thesis about how personal choices shape identities and providing key examples from the texts: People’s identities are often defined by the choices they made in the past, whether they are successful or unsuccessful, honorable or dishonorable. The authors and filmmakers of these stories demonstrate how a person’s choices can shape their narratives. In the novel, Pachinko by Min Jin Lee, we mainly see the story through the viewpoint of Sunja, but we will occasionally get to look through or see the thoughts of the people close to her. In the novel, it states, “If he did not marry her, she was a common slut who would be disgraced forever. The child would be another no-name bastard. Her mother’s boardinghouse would be contaminated by her shame”(Lee 49). Sunja made a big mistake by being with Hansu, and she paid the price. She was now pregnant, and the father of the child won’t be able to mar

In [60]:
doc = """
Compose a detailed body paragraph based on the theme that personal choices shape identities.

**Main Idea:** Personal choices play a significant role in shaping identities by influencing the narratives and experiences individuals encounter throughout life.

**Detailed Outline for body Paragraph:**
- The impact of Sunja's decision with Hansu on her and her family's reputation.
- How H's time in prison made him a hardworking and responsible person.
- The effect of Harry's biased view on his own identity and story.
- Tie it back to how these stories illustrate the link between personal choices and societal expectations.
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Increase token limit to enhance detail
    do_sample=True,  # Enable sampling for more diverse responses
    streamer=streamer
)

<bos>
Compose a detailed body paragraph based on the theme that personal choices shape identities.

**Main Idea:** Personal choices play a significant role in shaping identities by influencing the narratives and experiences individuals encounter throughout life.

**Detailed Outline for body Paragraph:**
- The impact of Sunja's decision with Hansu on her and her family's reputation.
- How H's time in prison made him a hardworking and responsible person.
- The effect of Harry's biased view on his own identity and story.
- Tie it back to how these stories illustrate the link between personal choices and societal expectations.
- Discuss how these experiences shaped the individuals' identities in noticeable ways.

**Body Paragraph:**

Personal choices serve as vibrant threads that intertwine to form intricate identities. They influence the narratives and experiences individuals encounter throughout life, shaping the narratives they tell about themselves and the world around them. Notably, S

In [85]:
p1_generated_second_paragraph_models_outline = '''Personal choices serve as vibrant threads that intertwine to form intricate identities. They influence the narratives and experiences individuals encounter throughout life, shaping the narratives they tell about themselves and the world around them. Notably, Sunja's decision to leave Hansu and their family's reputation hanging had a profound impact on both her and her family. It shifted the narrative around them, highlighting the painful realities of ostracism and the importance of truth. Conversely, H's time in prison transformed him into a hardworking and responsible individual. The confinement stripped away his vices and forced him to confront his mistakes. This period of hardship sculpted his character, making him stronger and more determined to build a better future for himself and his loved ones. Furthermore, Harry's biased view towards his own race had a significant impact on his identity and perception. The constant judgment and prejudice he encountered shaped his self-esteem and made it difficult to form genuine connections. The experience taught him the power of introspection and the importance of challenging biases. These stories illustrate the intricate connection between personal choices and societal expectations. Sunja's decision to set herself free was a testament to her individuality and her refusal to be defined by her association with Hansu. H's struggle in prison challenged his societal conditioning and encouraged him to question his perceptions. And Harry's journey taught him the importance of self-reflection and the need to embrace diversity.'''
p1_analysis_df_w_models_outline_second_paragraph = analyze_text(p1_generated_second_paragraph_models_outline)
display(p1_analysis_df_w_models_outline_second_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,Personal,79.364807
1,▁choices,13.022726
2,▁serve,11.531310
3,▁as,0.145287
4,▁vibrant,13.321798
...,...,...
275,▁need,0.867162
276,▁to,0.040447
277,▁embrace,5.336337
278,▁diversity,0.254955


#### Third Paragraph

In [188]:
%%time
doc = f'''Please transform the following detailed discussion into a structured outline, emphasizing the thesis on how external influences like family and societal contexts shape individual identities, and include key examples from the texts: {p1_essay_third_paragraph} '''
model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=300,
    do_sample=False,
    streamer=streamer
)

<bos>Please transform the following detailed discussion into a structured outline, emphasizing the thesis on how external influences like family and societal contexts shape individual identities, and include key examples from the texts: While some believe that their identity is solely based on their choices, people’s identities can be altered by the people around them and by their ancestors. When we start looking through the other lenses of people in the same situation, we can see that many times the expected outcome is different from reality. In the novel, Pachinko, it states, “‘Of course it would be far better for them if she went away’ Yangjin replied, knowing the hard truth. ‘The child would have a terrible life here. You’d be saving my daughter’s life as well'”(Lee 74). Isak knew the dilemma Sunja was in and he knew this was something that shouldn’t be taken lightly. After coming up with the idea of marrying Sunja and giving the child his last name, he had lifted the burden off Su

In [198]:
doc = """
Compose a detailed body paragraph on how external influences shape individual identities, particularly through family and societal contexts. Use direct quotes from 'Pachinko', 'Homegoing', and 'Stories We Tell' to support your analysis.

- From 'Pachinko', discuss how Yangjin's decision reflects societal pressures: "‘Of course it would be far better for them if she went away’ Yangjin replied, knowing the hard truth. ‘The child would have a terrible life here. You’d be saving my daughter’s life as well'" (Lee 74).
- From 'Homegoing', analyze Ness's identity shaped by her forced labor conditions: "Every day, Ness picked cotton under the punishing eye of the southern sun. She had been at Thomas Allan Stockham’s Alabama plantation for three months" (Gyasi 70).
- From 'Stories We Tell', reflect on how Sarah's understanding of her identity changes after learning about her paternity: "Sarah’s identity was shaped by the choices her mother made when she was alive. Sarah never knew that Michael was not her biological father until she grew up" (Documentary).

Conclude by tying these examples back to the broader impact of external influences on personal identity formation, illustrating how these forces dictate individual choices and life paths.
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=800,  # Higher token limit to leave room for quotes and analysis
    do_sample=False,  # No sampling, so the output is deterministic
    streamer=streamer
)

<bos>
Compose a detailed body paragraph on how external influences shape individual identities, particularly through family and societal contexts. Use direct quotes from 'Pachinko', 'Homegoing', and 'Stories We Tell' to support your analysis.

- From 'Pachinko', discuss how Yangjin's decision reflects societal pressures: "‘Of course it would be far better for them if she went away’ Yangjin replied, knowing the hard truth. ‘The child would have a terrible life here. You’d be saving my daughter’s life as well'" (Lee 74).
- From 'Homegoing', analyze Ness's identity shaped by her forced labor conditions: "Every day, Ness picked cotton under the punishing eye of the southern sun. She had been at Thomas Allan Stockham’s Alabama plantation for three months" (Gyasi 70).
- From 'Stories We Tell', reflect on how Sarah's understanding of her identity changes after learning about her paternity: "Sarah’s identity was shaped by the choices her mother made when she was alive. Sarah never knew that Mi

In [84]:
p1_generated_third_paragraph_models_outline = '''External influences play a significant role in shaping individual identities, particularly through the contexts of family and society. As illustrated through the narratives of 'Pachinko', 'Homegoing', and 'Stories We Tell', individuals are shaped by societal pressures, forced labor conditions, and the revelation of unexpected personal information. In 'Pachinko', Yangjin's decision reflects societal pressures, where a daughter's life is valued more for her potential than her own happiness. Similarly, Ness's identity is shaped by the brutal realities of forced labor, where physical and emotional suffering become defining aspects of her being. The revelation of Sarah's paternity in 'Stories We Tell' highlights the unexpected nature of identity formation. It forces her to confront the complexities of family relationships and the challenges of piecing together her own personal history. These examples illustrate how external influences can disrupt existing identities and necessitate the formation of new ones. In conclusion, external influences shape individual identities by dictating choices and life paths. From societal pressures and forced labor to unexpected personal information, these forces influence how individuals perceive themselves and navigate their place in the world. Understanding these external influences is crucial for comprehending the diversity of human experiences and the complexities of personal identity formation.'''
p1_analysis_df_w_models_outline_third_paragraph = analyze_text(p1_generated_third_paragraph_models_outline)
display(p1_analysis_df_w_models_outline_third_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,External,79.927307
1,▁influences,7.865835
2,▁play,5.458333
3,▁a,0.020925
4,▁significant,0.450013
...,...,...
240,▁of,0.000416
241,▁personal,2.005162
242,▁identity,0.287790
243,▁formation,0.118826


### P2's Essay

In [134]:
%%time
doc = f'''Please break this introduction paragraph into a detailed outline format focusing on the thesis statement and a brief introduction: {p2_essay_first_paragraph} '''
model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=300,
    do_sample=False,
    streamer=streamer
)

<bos>Please break this introduction paragraph into a detailed outline format focusing on the thesis statement and a brief introduction: When I first start reading the passage, I think about what Okonkwo is eating. I have never eaten a locust and I am not sure if I would want to. But it is interesting that Okonkwo is eating something considered a rare delicacy when Ezeudu comes to tell him that Ikemefuna is going to die. 

**Thesis Statement:**

The passage suggests that Okonkwo's eating habits reveal his deep connection to nature and his awareness of impending danger.

**Outline:**

**I. Introduction**

- Briefly introduce the passage and its context.
- Highlight the narrator's initial reaction to Okonkwo's eating habits.
- State the thesis statement.

**II. Okonkwo's Eating Habits**

- Discuss Okonkwo's unusual eating preference for locusts.
- Explain the cultural significance of locusts in Igbo society.
- Mention Okonkwo's lack of familiarity with locusts and his potential reluctance

In [136]:
doc = """
Using the outline and thesis provided below, compose a comprehensive introductory paragraph. The paragraph should seamlessly integrate the points from the outline into a cohesive narrative that sets the stage for the paper. This introduction should capture the essence of the thesis and outline, providing a clear and engaging overview of the topics to be discussed.

**Thesis Statement:** The passage suggests that Okonkwo's eating habits reveal his deep connection to nature and his awareness of impending danger.

**Detailed Outline for Introduction Paragraph:**

- Briefly introduce the passage and its context
- Hightlight the narrator's initial reaction to Okonkwo's eating habits
- State the thesis statement
- Discuss Okonkwo's unusual eating preference for locusts
- Explani, the cultural significance of locusts in Igbo society
- Mention Okonkwo's lack of familiarity with locusts and his potential reluctance to consume them.
- Explain that locusts are considered a rare delicacy in Igbo culture.
- Discuss the cultural beliefs surrounding locusts and their connection to death.
- Highlight the potential connection between locusts and Ikemefuna's impending death.
- Summarize the main points of the passage.
- Restate the thesis statement.
- Emphasize the connection between Okonkwo's eating habits and his awareness of danger.
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Higher token limit to leave room for a full paragraph
    do_sample=False,  # No sampling, so the output is deterministic
    streamer=streamer
)

<bos>
Using the outline and thesis provided below, compose a comprehensive introductory paragraph. The paragraph should seamlessly integrate the points from the outline into a cohesive narrative that sets the stage for the paper. This introduction should capture the essence of the thesis and outline, providing a clear and engaging overview of the topics to be discussed.

**Thesis Statement:** The passage suggests that Okonkwo's eating habits reveal his deep connection to nature and his awareness of impending danger.

**Detailed Outline for Introduction Paragraph:**

- Briefly introduce the passage and its context
- Hightlight the narrator's initial reaction to Okonkwo's eating habits
- State the thesis statement
- Discuss Okonkwo's unusual eating preference for locusts
- Explani, the cultural significance of locusts in Igbo society
- Mention Okonkwo's lack of familiarity with locusts and his potential reluctance to consume them.
- Explain that locusts are considered a rare delicacy in 

### P3's Essay

In [137]:
%%time
doc = f'''Please break this introduction paragraph into a detailed outline format focusing on the thesis statement and a brief introduction: {p3_essay_first_paragraph} '''
model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=400,
    do_sample=False,
    streamer=streamer
)

<bos>Please break this introduction paragraph into a detailed outline format focusing on the thesis statement and a brief introduction: Tiananmen Square is mostly ethical because it was designed to serve the people and the state leaders in many ways, it was surrounded by important buildings, and it had three entrances each serving in a different way.  

**Thesis Statement:** While Tiananmen Square may have had some shortcomings, its design and layout ultimately served the people and the state leaders by creating a space for public assembly, facilitating communication, and promoting national unity.

**Outline:**

**I. Introduction**
- Briefly describe Tiananmen Square and its significance.
- State the thesis statement.

**II. Design and Layout**
- Discuss the design features of Tiananmen Square.
- Explain how the layout was intended to serve the people and the state leaders.
- Mention the presence of important buildings surrounding the square.

**III. Function and Impact**
- Explain how

In [138]:
doc = """
Using the outline and thesis provided below, compose a comprehensive introductory paragraph. The paragraph should seamlessly integrate the points from the outline into a cohesive narrative that sets the stage for the paper. This introduction should capture the essence of the thesis and outline, providing a clear and engaging overview of the topics to be discussed.

**Thesis Statement:**  While Tiananmen Square may have had some shortcomings, its design and layout ultimately served the people and the state leaders by creating a space for public assembly, facilitating communication, and promoting national unity.

**Detailed Outline for Introduction Paragraph:**

- Briefly describe Tiananmen Square and its significance.
- State the thesis statement.
- Discuss the design features of Tiananmen Square.
- Explain how the layout was intended to serve the people and the state leaders.
- Mention the presence of important buildings surrounding the square.
- Explain how Tiananmen Square facilitated public assembly.
- Discuss how it promoted communication and interaction among different groups.
- Highlight the role of the three entrances in facilitating different purposes.
- Summarize the main points of the discussion.
- Restate the thesis statement.

"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Higher token limit to leave room for a full paragraph
    do_sample=False,  # No sampling, so the output is deterministic
    streamer=streamer
)

<bos>
Using the outline and thesis provided below, compose a comprehensive introductory paragraph. The paragraph should seamlessly integrate the points from the outline into a cohesive narrative that sets the stage for the paper. This introduction should capture the essence of the thesis and outline, providing a clear and engaging overview of the topics to be discussed.

**Thesis Statement:**  While Tiananmen Square may have had some shortcomings, its design and layout ultimately served the people and the state leaders by creating a space for public assembly, facilitating communication, and promoting national unity.

**Detailed Outline for Introduction Paragraph:**

- Briefly describe Tiananmen Square and its significance.
- State the thesis statement.
- Discuss the design features of Tiananmen Square.
- Explain how the layout was intended to serve the people and the state leaders.
- Mention the presence of important buildings surrounding the square.
- Explain how Tiananmen Square faci

## Model Generates Paragraph from Human-Written Outline
This approach uses an outline written by a human to guide the model in writing a detailed paragraph. Unlike the previous section, where the model first produces its own outline from an essay and then writes a paragraph, here the outline is fixed in advance. This shows how a clear plan can be used to directly produce a well-organized paragraph that covers the specific ideas the writer wants to discuss.

### P1's Essay
#### First Paragraph

In [53]:
doc = f"""
Please convert the following outline into a well-structured academic introduction paragraph. The paragraph should integrate all the points cohesively, providing a clear and engaging introduction to the topic.{p1_outline_first_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=128,
    do_sample=False,
    streamer=streamer
)

<bos>
Please convert the following outline into a well-structured academic introduction paragraph. The paragraph should integrate all the points cohesively, providing a clear and engaging introduction to the topic.
Thesis: Authors and filmmaker use multiple perspectives to illustrate how personal and ancestral choices shape individual narratives and identities, demonstrating their theories on how history should be constructed.
Brief introduction of works and creators: Discusses Homegoing by Yaa Gyasi, Pachinko by Min Jin Lee, and Stories We Tell by Sarah Polley, setting the stage for an exploration of their narrative techniques.

This paper investigates how authors and filmmakers utilize multiple perspectives to depict the complexities of personal and ancestral choices, challenging traditional narratives and emphasizing the power of individual narratives to shape identity and history. By examining the works of Yaa Gyasi, Min Jin Lee, and Sarah Polley, this paper argues that personal an

In [64]:
p1_generated_first_paragraph_human_outline = '''This paper investigates how authors and filmmakers utilize multiple perspectives to depict the complexities of personal and ancestral choices, challenging traditional narratives and emphasizing the power of individual narratives to shape identity and history. By examining the works of Yaa Gyasi, Min Jin Lee, and Sarah Polley, this paper argues that personal and ancestral choices are not merely individual experiences but are also shaped by broader historical contexts, familial dynamics, and societal narratives.This paper seeks to demonstrate how authors and filmmakers employ multiple perspectives to illuminate the multifaceted nature of human experience, emphasizing the role of personal narratives in understanding the complexities of identity and history.'''
p1_analysis_df_w_human_outline_first_paragraph = analyze_text(p1_generated_first_paragraph_human_outline)
display(p1_analysis_df_w_human_outline_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,This,72.989807
1,▁paper,10.340496
2,▁investigates,2.858632
3,▁how,4.138315
4,▁authors,12.082332
...,...,...
115,▁of,0.022485
116,▁identity,0.638408
117,▁and,0.243705
118,▁history,0.112638
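
To compare context conditions, the per-token losses in tables like the one above can be reduced to one number per paragraph. The sketch below assumes that the mean is an acceptable summary and that the first token should be excluded, since its loss (roughly 73 to 80 in every table shown) is presumably scored with no preceding context. `mean_token_loss` is a hypothetical helper, and the input is only the five rows displayed, not the full paragraph.

```python
from statistics import mean


def mean_token_loss(losses, skip_first: bool = True) -> float:
    """Average per-token loss, optionally ignoring the first token."""
    values = [float(x) for x in losses]
    if skip_first:
        values = values[1:]
    if not values:
        raise ValueError("no token losses to average")
    return mean(values)


# Rows 0-4 of the table above (a partial sample, for illustration only)
displayed = [72.989807, 10.340496, 2.858632, 4.138315, 12.082332]
print(mean_token_loss(displayed))                    # first token excluded
print(mean_token_loss(displayed, skip_first=False))  # first token included
```

The function accepts any iterable of numbers, so a column such as `p1_analysis_df_w_human_outline_first_paragraph['token_loss']` could be passed directly. Applying the same call to the no-context, model-outline, and human-outline DataFrames would give comparable per-paragraph summaries.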


#### Second Paragraph

In [169]:
doc = f"""
Generate a detailed academic body paragraph that explores and expands upon the themes outlined below. The paragraph should delve into the details, providing a thorough analysis and discussion of the specific aspects of the topic mentioned in the outline and illustrated by the referenced paragraph.

Outline:
{p1_outline_second_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=300,
    do_sample=False,
    streamer=streamer
)


<bos>
Generate a detailed academic body paragraph that explores and expands upon the themes outlined below. The paragraph should delve into the details, providing a thorough analysis and discussion of the specific aspects of the topic mentioned in the outline and illustrated by the referenced paragraph.

Outline:

Main idea: Personal decisions directly shape characters' identities and futures in all three works, highlighting the authors' and filmmaker's focus on the impact of individual agency within broader historical and social contexts.
Explanation: This exploration of personal choice aligns with the creators' views that history is not merely a series of events but a complex tapestry woven from individual actions and their consequences.
Example from Pachinko: Sunja's pivotal decision to engage with Hansu, and its ramifications, showcase how personal mistakes and moral dilemmas are central to character development and plot progression.
Example from Homegoing: The character H’s experi

In [75]:
p1_generated_second_paragraph_human_outline = '''By examining these works through a lens of personal agency, this analysis underscores the importance of individual choice in shaping narratives and understanding the complex interplay between personal experiences, historical contexts, and societal structures. The narratives of Sunja, H, and Harry in Pachinko, Homegoing, and Stories We Tell reveal a recurring theme of the profound impact of personal decisions on shaping characters' identities and futures. Authors and filmmakers alike emphasize the significance of individual agency within broader historical and social contexts. This exploration aligns with the creators' belief that history is not merely a series of events but a tapestry woven from individual actions and their consequences. Through close examination of these works, we witness how personal choices serve as catalysts for character development and plot progression. Sunja's decision to engage with Hansu, despite societal disapproval, exemplifies how individual agency can lead to unexpected consequences and personal growth. H's experience with forced labor compels him to confront the limitations of his own body and the systemic oppression he has endured. Harry's narrative, marked by his one-sided love affair, highlights how personal perceptions can shape one's identity and the stories they choose to tell.'''
p1_analysis_df_w_human_outline_second_paragraph = analyze_text(p1_generated_second_paragraph_human_outline)
display(p1_analysis_df_w_human_outline_second_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,By,76.927307
1,▁examining,5.146006
2,▁these,5.455596
3,▁works,7.330101
4,▁through,5.878495
...,...,...
225,▁they,2.904954
226,▁choose,0.116842
227,▁to,0.000144
228,▁tell,0.014521


#### Third Paragraph

In [172]:
doc = f"""
Generate a detailed academic body paragraph that explores and expands upon the themes outlined below. The paragraph should delve into the details, providing a thorough analysis and discussion of the specific aspects of the topic mentioned in the outline and illustrated by the referenced paragraph.

Outline:
{p1_outline_third_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=350,
    do_sample=False,
    streamer=streamer
)


<bos>
Generate a detailed academic body paragraph that explores and expands upon the themes outlined below. The paragraph should delve into the details, providing a thorough analysis and discussion of the specific aspects of the topic mentioned in the outline and illustrated by the referenced paragraph.

Outline:

Main idea: Characters’ identities and life paths are significantly influenced by the actions and statuses of those around them and their ancestors, emphasizing the interconnectedness of personal histories within larger societal narratives.
Explanation: This theme underscores the authors' and filmmaker's perspective that individual lives are not isolated but are deeply affected by the historical and relational contexts in which they exist, echoing a broader theory that history is constructed collectively rather than singularly.
Example from Pachinko: Isak’s altruistic decision to marry Sunja provides a stark contrast to her initial dilemma, showing how benevolent actions from 

In [76]:
p1_generated_third_paragraph_human_outline = '''By exploring the interconnectedness of personal histories, this theme underscores the power of social context in shaping individual lives and emphasizes the importance of acknowledging the collective and historical forces that influence personal experiences. The intricate tapestry of human identity is intricately woven with the threads of those around us and our ancestors, creating a complex and interconnected narrative that unfolds through generations. This interconnectedness emerges from the shared experiences, histories, and social contexts that shape the lives of individuals and communities. As illustrated by the narratives of Isak, Esi, and Diane, individuals are not isolated entities but are shaped by the benevolent actions of others, the legacy of the past, and the choices made by their parents and ancestors. This interconnectedness is evident in the choices and decisions made by characters in literary works. Isak’s decision to marry Sunja highlights the transformative power of benevolent actions, redirecting her life trajectory towards a more fulfilling path. Esi and her descendant Ness bear the burden of the legacy of slavery, shaping their identities and opportunities long after the original events have passed. Similarly, Diane’s decision to raise her children based on her own experiences underscores the profound impact of parental choices on children’s identities and their understanding of family narratives. By exploring the interconnectedness of personal histories, this theme underscores the power of social context in shaping individual lives. It challenges the notion of individual agency and emphasizes the role of social forces in determining personal experiences. This understanding is crucial for fostering empathy and promoting social justice, as it encourages us to acknowledge the interconnectedness of our world and to work towards creating a more just and equitable society.'''
p1_analysis_df_w_human_outline_third_paragraph = analyze_text(p1_generated_third_paragraph_human_outline)
display(p1_analysis_df_w_human_outline_third_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,By,76.927307
1,▁exploring,8.208507
2,▁the,0.343502
3,▁interconnected,5.428399
4,ness,0.016759
...,...,...
317,▁just,0.586601
318,▁and,0.000334
319,▁equitable,0.006663
320,▁society,0.284619


### P2's Essay

In [142]:
doc = f"""
Please convert the following outline into a well-structured academic introduction paragraph. The paragraph should integrate all the points cohesively, providing a clear and engaging introduction to the topic.{p2_outline_first_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=300,
    do_sample=False,
    streamer=streamer
)

<bos>
Please convert the following outline into a well-structured academic introduction paragraph. The paragraph should integrate all the points cohesively, providing a clear and engaging introduction to the topic.
Brief introduction to the context of the scene where Okonkwo eats a locust.
Mention of Ezeudu’s arrival and the news he brings about Ikemefuna’s fate.
Okonkwo’s reaction to the locust and its connection to his ancestral heritage.

**Introduction:**

The narrative of Okonkwo’s encounter with a locust serves as a poignant allegory for the complex interplay between ancestral heritage, cultural identity, and the consequences of societal upheaval. As Ezeudu’s arrival marks the arrival of a new era, Okonkwo’s reaction to the locust becomes a lens through which to explore the tensions and transformations that ensue with the disruption of ancestral structures and the disruption of social order. Moreover, the locust serves as a catalyst for Okonkwo’s introspection on his ancestral he

In [145]:
p2_generated_first_paragraph_human_outline = '''The narrative of Okonkwo’s encounter with a locust serves as a poignant allegory for the complex interplay between ancestral heritage, cultural identity, and the consequences of societal upheaval. As Ezeudu’s arrival marks the arrival of a new era, Okonkwo’s reaction to the locust becomes a lens through which to explore the tensions and transformations that ensue with the disruption of ancestral structures and the disruption of social order. Moreover, the locust serves as a catalyst for Okonkwo’s introspection on his ancestral heritage and the connections he has to his community.'''
p2_analysis_df_w_human_outline_first_paragraph = analyze_text(p2_generated_first_paragraph_human_outline)
display(p2_analysis_df_w_human_outline_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,The,69.489807
1,▁narrative,8.640917
2,▁of,3.395686
3,▁Ok,11.922210
4,onk,0.006918
...,...,...
108,▁has,0.537829
109,▁to,1.100608
110,▁his,0.681048
111,▁community,0.176694


### P3's Essay

In [146]:
doc = f"""
Please convert the following outline into a well-structured academic introduction paragraph. The paragraph should integrate all the points cohesively, providing a clear and engaging introduction to the topic.{p3_outline_first_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=300,
    do_sample=False,
    streamer=streamer
)

<bos>
Please convert the following outline into a well-structured academic introduction paragraph. The paragraph should integrate all the points cohesively, providing a clear and engaging introduction to the topic.Overview of Tiananmen Square's significance and thesis statement.
## Introduction

The sprawling expanse of Tiananmen Square in Beijing, China, stands as a potent symbol of political dissent, social transformation, and monumental struggle for freedom. Its iconic landscape, characterized by its monumental statues, bustling crowds, and historical significance, has captivated the imaginations of people worldwide. However, beneath the surface of this public space lies a complex tapestry of power dynamics, political tensions, and the enduring legacy of historical struggles. This paper aims to explore the multifaceted significance of Tiananmen Square, arguing that it serves as a crucial platform for dissent, a catalyst for social change, and a testament to the human spirit's resili

In [147]:
p3_generated_first_paragraph_human_outline = '''The sprawling expanse of Tiananmen Square in Beijing, China, stands as a potent symbol of political dissent, social transformation, and monumental struggle for freedom. Its iconic landscape, characterized by its monumental statues, bustling crowds, and historical significance, has captivated the imaginations of people worldwide. However, beneath the surface of this public space lies a complex tapestry of power dynamics, political tensions, and the enduring legacy of historical struggles. This paper aims to explore the multifaceted significance of Tiananmen Square, arguing that it serves as a crucial platform for dissent, a catalyst for social change, and a testament to the human spirit's resilience in the face of adversity. This paper argues that Tiananmen Square holds immense significance as a symbol of political dissent, social transformation, and human resilience in the face of adversity. It contends that the square's iconic landscape, historical significance, and the ongoing struggle for freedom within its boundaries collectively contribute to its enduring power and influence.'''
p3_analysis_df_w_human_outline_first_paragraph = analyze_text(p3_generated_first_paragraph_human_outline)
display(p3_analysis_df_w_human_outline_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,The,69.489807
1,▁sprawling,11.703417
2,▁expanse,5.455361
3,▁of,0.005493
4,▁Tian,10.368990
...,...,...
187,▁enduring,0.200978
188,▁power,1.901733
189,▁and,0.447452
190,▁influence,1.220196


## Generated Introduction Paragraph given Human Written Essay & Human Written Outline
In this approach, we give the model both a detailed outline and the complete essay, each written by a human, to guide it in writing a new essay (split into paragraphs). This differs from the other sections because the outline and the essay are both supplied as context, and the model combines them to write a paragraph that fits the outlined points and the essay's content. This approach should demonstrate how machine learning can use provided materials to produce new, well-formed paragraphs that match the original essay's tone and structure.
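
The prompts in this section all share one layout: an instruction, the human-written essay paragraph, and the human-written outline. A minimal sketch of that layout as a reusable function is shown here. `build_context_prompt` and its parameter names are hypothetical (the notebook writes each prompt inline as an f-string), and the inputs are placeholders rather than text from the essays.

```python
def build_context_prompt(instruction, essay_context, outline,
                         outline_label="Outline for Introduction"):
    """Assemble a prompt with the same layout as the cells in this section.

    Hypothetical helper: the notebook itself builds each prompt inline.
    """
    return (
        f"\n{instruction}\n\n"
        f"**Essay Context:** {essay_context}\n\n"
        f"**{outline_label}:**\n"
        f"{outline}\n\n"
    )


# Placeholder inputs, not taken from the essays.
doc = build_context_prompt(
    instruction="Generate an introduction paragraph leveraging the provided essay context and outline.",
    essay_context="<human-written paragraph>",
    outline="<human-written outline>",
)
print(doc)
```

The resulting `doc` string could then be passed to `tokenizer(...)` and `model.generate(...)` in the same way as the cells in this section.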

### P1's Essay
#### First Paragraph

In [257]:
doc = f"""
Generate an introduction paragraph leveraging the provided essay context and outline to craft a coherent and engaging narrative. The introduction should seamlessly integrate the themes outlined in the essay context and the structure provided in the outline, offering a compelling overview of the essay's argument and significance.

**Essay Context:** {p1_essay_first_paragraph}

**Outline for Introduction:**
{p1_outline_first_paragraph}

"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Adjusted for potentially longer introductory needs
    do_sample=False,  # Greedy decoding (no sampling) for repeatable output
    streamer=streamer
)

<bos>
Generate an introduction paragraph leveraging the provided essay context and outline to craft a coherent and engaging narrative. The introduction should seamlessly integrate the themes outlined in the essay context and the structure provided in the outline, offering a compelling overview of the essay's argument and significance.

**Essay Context:** Many people believe that there is only one story going on in the world, and they are the main characters. They believe that everything should revolve around them, and everybody else in this world is a small side character. We often don't think that there are billions of different stories happening all at the same time, which can then affect our narratives through their perspectives. The authors and filmmakers of Homegoing, Pachinko, and Stories We Tell, use the perspective of many people in similar situations to demonstrate how the narratives of people’s lives can be shaped and defined by the choices they make personally, and the peopl

In [79]:
p1_generated_first_paragraph_human_outline_and_essay = '''In the tapestry of human existence, where narratives intertwine, revealing the complexities of our individual journeys, we often find ourselves questioning the singular, dominant narrative that pervades our understanding of the world. While the individual may hold significant power in shaping their personal narrative, the broader context of history and the choices of those who came before them also play a crucial role in defining our identities and destinies. This essay explores how authors and filmmakers utilize multiple perspectives to illustrate how personal and ancestral choices shape individual narratives and identities, challenging our assumptions about the nature of history and storytelling.'''
p1_analysis_df_w_human_outline_and_essay_first_paragraph = analyze_text(p1_generated_first_paragraph_human_outline_and_essay)
display(p1_analysis_df_w_human_outline_and_essay_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,In,72.739807
1,▁the,0.542588
2,▁tapestry,8.662266
3,▁of,0.000673
4,▁human,2.316590
...,...,...
107,▁of,0.007170
108,▁history,5.449386
109,▁and,0.034599
110,▁storytelling,4.338239


#### Second Paragraph

In [152]:
doc = f"""
In this section, we aim to construct a comprehensive academic body paragraph that delves into the intricate themes presented in the provided outline and related essay context. The paragraph will thoroughly analyze and discuss the specific elements highlighted in the outline, all while drawing on examples from the referenced essay to enrich the discussion.

**Essay Context:** {p1_essay_second_paragraph}

**Outline for Second Paragraph:**
{p1_outline_second_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Adjusted for potentially longer paragraph needs
    do_sample=False,  # Greedy decoding (no sampling) for repeatable output
    streamer=streamer
)

<bos>
In this section, we aim to construct a comprehensive academic body paragraph that delves into the intricate themes presented in the provided outline and related essay context. The paragraph will thoroughly analyze and discuss the specific elements highlighted in the outline, all while drawing on examples from the referenced essay to enrich the discussion.

**Essay Context:** People’s identities are often defined by the choices they made in the past, whether they are successful or unsuccessful, honorable or dishonorable. The authors and filmmakers of these stories demonstrate how a person’s choices can shape their narratives. In the novel, Pachinko by Min Jin Lee, we mainly see the story through the viewpoint of Sunja, but we will occasionally get to look through or see the thoughts of the people close to her. In the novel, it states, “If he did not marry her, she was a common slut who would be disgraced forever. The child would be another no-name bastard. Her mother’s boardinghou

In [80]:
p1_generated_second_paragraph_human_outline_and_essay = '''In conclusion, the exploration of personal choice in these three works underscores the transformative power of individual agency within broader historical and social contexts. By examining how characters navigate their choices and their consequences, these narratives challenge the notion of predetermined destinies and emphasize the importance of personal responsibility and agency in shaping one’s identity and future.'''
p1_analysis_df_w_human_outline_and_essay_second_paragraph = analyze_text(p1_generated_second_paragraph_human_outline_and_essay)
display(p1_analysis_df_w_human_outline_and_essay_second_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,In,7.273981e+01
1,▁conclusion,1.360509e+01
2,",",3.404864e-03
3,▁the,7.178910e-01
4,▁exploration,5.447742e+00
...,...,...
58,s,8.344654e-07
59,▁identity,3.248956e+00
60,▁and,1.116658e-01
61,▁future,1.100144e+00


#### Third Paragraph

In [265]:
doc = f"""
Generate a detailed academic body paragraph that explores and expands upon the themes outlined below. The paragraph should delve into the details, providing a thorough analysis and discussion of the specific aspects of the topic mentioned in the outline and illustrated by the referenced paragraph.

**Essay Context:** {p1_essay_third_paragraph}

**Outline for Second Paragraph:**
{p1_outline_third_paragraph}
"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=600,  # Adjusted for potentially longer paragraph needs
    do_sample=False,  # Greedy decoding (no sampling) for repeatable output
    streamer=streamer
)

<bos>
Generate a detailed academic body paragraph that explores and expands upon the themes outlined below. The paragraph should delve into the details, providing a thorough analysis and discussion of the specific aspects of the topic mentioned in the outline and illustrated by the referenced paragraph.

**Essay Context:** While some believe that their identity is solely based on their choices, people’s identities can be altered by the people around them and by their ancestors. When we start looking through the other lenses of people in the same situation, we can see that many times the expected outcome is different from reality. In the novel, Pachinko, it states, “‘Of course it would be far better for them if she went away’ Yangjin replied, knowing the hard truth. ‘The child would have a terrible life here. You’d be saving my daughter’s life as well'”(Lee 74). Isak knew the dilemma Sunja was in and he knew this was something that shouldn’t be taken lightly. After coming up with the id

In [81]:
p1_generated_third_paragraph_human_outline_and_essay = '''By exploring how characters’ identities are shaped by the actions and contexts of others, this theme underscores the interconnectedness of personal histories and the role of history in shaping individual lives.'''
p1_analysis_df_w_human_outline_and_essay_third_paragraph = analyze_text(p1_generated_third_paragraph_human_outline_and_essay)
display(p1_analysis_df_w_human_outline_and_essay_third_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,By,76.927307
1,▁exploring,8.208507
2,▁how,4.609127
3,▁characters,7.782189
4,’,5.707983
5,▁identities,5.345525
6,▁are,1.201509
7,▁shaped,0.23146
8,▁by,0.643914
9,▁the,2.139013


### P2's Essay

In [148]:
doc = f"""
Generate an introduction paragraph leveraging the provided essay context and outline to craft a coherent and engaging narrative. The introduction should seamlessly integrate the themes outlined in the essay context and the structure provided in the outline, offering a compelling overview of the essay's argument and significance.

**Essay Context:** {p2_essay_first_paragraph}

**Outline for Introduction:**
{p2_outline_first_paragraph}

"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Adjusted for potentially longer introductory needs
    do_sample=False,  # Greedy decoding (no sampling) for repeatable output
    streamer=streamer
)

<bos>
Generate an introduction paragraph leveraging the provided essay context and outline to craft a coherent and engaging narrative. The introduction should seamlessly integrate the themes outlined in the essay context and the structure provided in the outline, offering a compelling overview of the essay's argument and significance.

**Essay Context:** When I first start reading the passage, I think about what Okonkwo is eating. I have never eaten a locust and I am not sure if I would want to. But it is interesting that Okonkwo is eating something considered a rare delicacy when Ezeudu comes to tell him that Ikemefuna is going to die.

**Outline for Introduction:**

Brief introduction to the context of the scene where Okonkwo eats a locust.
Mention of Ezeudu’s arrival and the news he brings about Ikemefuna’s fate.

**Introduction:**

The passage presents a tantalizing glimpse into the cultural significance of locust consumption in Igbo society. While Okonkwo, a revered warrior, faces

In [150]:
p2_generated_first_paragraph_human_outline_and_essay = '''The passage presents a tantalizing glimpse into the cultural significance of locust consumption in Igbo society. While Okonkwo, a revered warrior, faces a dilemma due to Ezeudu’s ominous arrival, the act of consuming locust transcends mere sustenance. It becomes a potent symbol of both defiance and vulnerability, reflecting the complex interplay between tradition, modernity, and the human condition. Through this seemingly mundane act, the passage invites us to ponder the nature of delicacy, the power of cultural identity, and the precariousness of existence.'''
p2_analysis_df_w_human_outline_and_essay_first_paragraph = analyze_text(p2_generated_first_paragraph_human_outline_and_essay)
display(p2_analysis_df_w_human_outline_and_essay_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,The,69.489807
1,▁passage,3.172167
2,▁presents,3.454353
3,▁a,0.494193
4,▁tantal,12.497955
...,...,...
99,▁precarious,4.997609
100,ness,1.804892
101,▁of,0.006073
102,▁existence,0.246460


### P3's Essay

In [149]:
doc = f"""
Generate an introduction paragraph leveraging the provided essay context and outline to craft a coherent and engaging narrative. The introduction should seamlessly integrate the themes outlined in the essay context and the structure provided in the outline, offering a compelling overview of the essay's argument and significance.

**Essay Context:** {p3_essay_first_paragraph}

**Outline for Introduction:**
{p3_outline_first_paragraph}

"""

model_out = model.generate(
    **tokenizer(doc, return_tensors='pt').to(model.device),
    max_new_tokens=500,  # Adjusted for potentially longer introductory needs
    do_sample=False,  # Greedy decoding (no sampling) for repeatable output
    streamer=streamer
)

<bos>
Generate an introduction paragraph leveraging the provided essay context and outline to craft a coherent and engaging narrative. The introduction should seamlessly integrate the themes outlined in the essay context and the structure provided in the outline, offering a compelling overview of the essay's argument and significance.

**Essay Context:** Tiananmen Square is mostly ethical because it was designed to serve the people and the state leaders in many ways, it was surrounded by important buildings, and it had three entrances each serving in a different way. 

**Outline for Introduction:**
Overview of Tiananmen Square's significance and thesis statement.

**Introduction:**

Standing tall in Beijing, Tiananmen Square embodies the grandeur and aspirations of a nation. Yet, beneath its majestic facade lies a complex tapestry of historical significance and contested narratives. While often lauded for its symbolic role in upholding democracy and human rights, the square's design an

In [151]:
p3_generated_first_paragraph_human_outline_and_essay = '''Standing tall in Beijing, Tiananmen Square embodies the grandeur and aspirations of a nation. Yet, beneath its majestic facade lies a complex tapestry of historical significance and contested narratives. While often lauded for its symbolic role in upholding democracy and human rights, the square's design and layout raise questions about its true purpose and potential impact on the surrounding community. This essay argues that while the intentions behind the square's design were undoubtedly noble, its current form perpetuates a sense of alienation and disconnection within the community. By exploring the interplay between historical context, architectural design, and social dynamics, this essay seeks to challenge the simplistic narrative of Tiananmen Square as a symbol of ethical conduct and highlight its potential to perpetuate social inequalities.'''
p3_analysis_df_w_human_outline_and_essay_first_paragraph = analyze_text(p3_generated_first_paragraph_human_outline_and_essay)
display(p3_analysis_df_w_human_outline_and_essay_first_paragraph[['original_token', 'token_loss']])

Unnamed: 0,original_token,token_loss
0,Standing,74.302307
1,▁tall,1.614646
2,▁in,1.310400
3,▁Beijing,10.723093
4,",",0.418464
...,...,...
141,▁to,0.419028
142,▁perpetuate,1.306165
143,▁social,1.258569
144,▁inequalities,1.200097


# Essay Results

# Original Essay for Reference
This section focuses on the analysis of a single essay (P1's) to maintain clarity and organization throughout the project.

In [93]:
print("Original Essay(P1):\n\n", p1_essay_first_paragraph + "\n\n" + p1_essay_second_paragraph + "\n\n" + p1_essay_third_paragraph)

Original Essay(P1):

 Many people believe that there is only one story going on in the world, and they are the main characters. They believe that everything should revolve around them, and everybody else in this world is a small side character. We often don't think that there are billions of different stories happening all at the same time, which can then affect our narratives through their perspectives. The authors and filmmakers of Homegoing, Pachinko, and Stories We Tell, use the perspective of many people in similar situations to demonstrate how the narratives of people’s lives can be shaped and defined by the choices they make personally, and the people around them including their ancestors. 

People’s identities are often defined by the choices they made in the past, whether they are successful or unsuccessful, honorable or dishonorable. The authors and filmmakers of these stories demonstrate how a person’s choices can shape their narratives. In the novel, Pachinko by Min Jin Lee

## Generated Introduction Paragraph Based on most_likely_token (No Context, Only most_likely_token Chosen)

In [94]:
p1_most_likely_token_first_paragraph = "".join(p1_analysis_df_w_no_context_first_paragraph['most_likely_token'])
p1_most_likely_token_second_paragraph = "".join(p1_analysis_df_w_no_context_second_paragraph['most_likely_token'])
p1_most_likely_token_third_paragraph = "".join(p1_analysis_df_w_no_context_third_paragraph['most_likely_token'])

print("Revised Essay Based on Most Likely Token/No Context(P1):\n", p1_most_likely_token_first_paragraph + "\n" + p1_most_likely_token_second_paragraph + "\n" + p1_most_likely_token_third_paragraph)

Revised Essay Based on Most Likely Token/No Context(P1):
  increa people believe that the is a one true to on in the universe. and that call unwilling ones characters in This believe that their that fit around them and and that else is the story is simply supporting character character.

 need see't see about there are multiple of other stories happening at at once same time, but is make lead our perception. the focus and

 idea of artists who these Alone explore Chiinko, and Things of Tell use for their concept of African characters to the situations to explore the the power are these ofs lives intertw intertw interconnected by influenced by the choices they make. and and how impact they them. the families.


 increa oftens Republic are shaped shaped by their roles they make in life past. and consciously were good or unsuccessful. and or dishonorable.

 choices of editors who this narratives often a choices person’s identity define shape their destiny and

 this context “ “inko, Y Ji

## Generate an Outline from Human Written Essay & Revise that Outline to Generate a New Paragraph
### P1's Essay

In [95]:
print("\nEssay Generated by Model:\n", p1_generated_first_paragraph_models_outline + "\n" + p1_generated_second_paragraph_models_outline + "\n" + p1_generated_third_paragraph_models_outline)


Essay Generated by Model:
 Despite the pervasive belief of a singular narrative dominating the human experience, the reality is far more nuanced. The multiplicity of stories that populate our world reflects the richness and complexity of human experience, offering a wealth of insights into our own lives and the world around us. This paper delves into the power of narratives, exploring how they shape our perspectives, influence personal and collective consciousness, and provide context and meaning. By examining the narratives of Yaa Gyasi, Min Jin Lee, and Sarah Polley, we will discover how personal choices and the choices of others shape the narratives we embrace, ultimately contributing to our personal growth and collective understanding.
Personal choices serve as vibrant threads that intertwine to form intricate identities. They influence the narratives and experiences individuals encounter throughout life, shaping the narratives they tell about themselves and the world around them.

## Generated Introduction Paragraph Given Desired Human Outline

In [96]:
print("\nEssay Generated by Model:\n\n", p1_generated_first_paragraph_human_outline + "\n\n" + p1_generated_second_paragraph_human_outline + "\n\n" + p1_generated_third_paragraph_human_outline)


Essay Generated by Model:

 This paper investigates how authors and filmmakers utilize multiple perspectives to depict the complexities of personal and ancestral choices, challenging traditional narratives and emphasizing the power of individual narratives to shape identity and history. By examining the works of Yaa Gyasi, Min Jin Lee, and Sarah Polley, this paper argues that personal and ancestral choices are not merely individual experiences but are also shaped by broader historical contexts, familial dynamics, and societal narratives.This paper seeks to demonstrate how authors and filmmakers employ multiple perspectives to illuminate the multifaceted nature of human experience, emphasizing the role of personal narratives in understanding the complexities of identity and history.

By examining these works through a lens of personal agency, this analysis underscores the importance of individual choice in shaping narratives and understanding the complex interplay between personal expe

## Generated Introduction Paragraph Given Human Essay & Outline

In [97]:
print("\nEssay Generated by Model:\n\n", p1_generated_first_paragraph_human_outline_and_essay + "\n\n" + p1_generated_second_paragraph_human_outline_and_essay + "\n\n" + p1_generated_third_paragraph_human_outline_and_essay)


Essay Generated by Model:

 In the tapestry of human existence, where narratives intertwine, revealing the complexities of our individual journeys, we often find ourselves questioning the singular, dominant narrative that pervades our understanding of the world. While the individual may hold significant power in shaping their personal narrative, the broader context of history and the choices of those who came before them also play a crucial role in defining our identities and destinies. This essay explores how authors and filmmakers utilize multiple perspectives to illustrate how personal and ancestral choices shape individual narratives and identities, challenging our assumptions about the nature of history and storytelling.

In conclusion, the exploration of personal choice in these three works underscores the transformative power of individual agency within broader historical and social contexts. By examining how characters navigate their choices and their consequences, these narra

# Revising the Essay based on Token Loss

In [127]:
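def create_optimized_paragraph_based_on_token_loss(original_df, generated_df):
    """Merge two token tables position by position, keeping the lower-loss token.

    Both DataFrames need 'original_token' and 'token_loss' columns. Rows are
    compared by position only; the two texts are not aligned by meaning.
    """
    optimized_tokens = []

    # Reset to a 0..n-1 index so that row labels double as positions.
    original_df = original_df.reset_index(drop=True)
    generated_df = generated_df.reset_index(drop=True)

    for index, orig_row in original_df.iterrows():
        if index < len(generated_df):
            gen_row = generated_df.iloc[index]
            # Ties keep the human-written token.
            if orig_row['token_loss'] <= gen_row['token_loss']:
                optimized_tokens.append(orig_row['original_token'])
            else:
                optimized_tokens.append(gen_row['original_token'])
        else:
            # The generated text is shorter: keep the rest of the original.
            optimized_tokens.append(orig_row['original_token'])

    # '▁' marks a word boundary in the tokenizer output; turn it back into a space.
    optimized_paragraph = ''.join(optimized_tokens).replace('▁', ' ')
    return optimized_paragraph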
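
The selection rule in the function above can be traced on a toy input. The sketch below is a minimal, list-based restatement of the same rule, not the notebook's DataFrame code. The helper name `merge_by_token_loss`, the tokens, and the loss values are all invented for illustration.

```python
def merge_by_token_loss(original, generated):
    """Keep, at each position, the token with the lower loss.

    `original` and `generated` are lists of (token, loss) pairs.
    Ties keep the original token, and positions past the end of
    `generated` also keep the original token.
    """
    merged = []
    for i, (orig_tok, orig_loss) in enumerate(original):
        if i < len(generated) and generated[i][1] < orig_loss:
            merged.append(generated[i][0])
        else:
            merged.append(orig_tok)
    # '▁' marks a word boundary, as in the notebook's token tables.
    return "".join(merged).replace("▁", " ")


# Hypothetical tokens and losses, chosen only to exercise each branch.
original = [("▁Many", 2.0), ("▁people", 0.5), ("▁believe", 1.0), ("▁that", 0.3)]
generated = [("▁This", 1.2), ("▁essay", 0.9), ("▁believe", 1.0)]

merged_text = merge_by_token_loss(original, generated)
print(repr(merged_text))
```

Because the comparison is purely positional, a swapped-in token comes from whatever the generated text happens to contain at that index. This is one plausible reason a merged paragraph can read as a mix of two unrelated sentences.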

## Generate an Outline from Human-Written Essay & Revise that Outline to Generate a New Paragraph


In [118]:
p1_generated_first_paragraph_models_outline_df = analyze_text(p1_generated_first_paragraph_models_outline)
optimized_paragraph = create_optimized_paragraph_based_on_token_loss(
    p1_analysis_df_w_no_context_first_paragraph, 
    p1_generated_first_paragraph_models_outline_df
)

print("Optimized Paragraph:")
print(optimized_paragraph)

Optimized Paragraph:
Many the believe that of is singular one dominating the on in, the, is far more nuanced. characters. of believe that everything our world around the richness and complexity of human experience world is a wealth of character. our often lives't world around us. This of delves into the at of same time, how they shape our our, influence their and collective The, and provide of and meaning, Byinko, and of We Tell Gyasi, perspective of Lee, and similar situationsley, we the discover how personal choicess lives choices of others shape the by we choices they make contributing to and the growth and them understanding. ancestors. 


## Generated Introduction Paragraph Given Desired Human Outline

In [125]:
p1_generated_first_paragraph_human_outline_df = analyze_text(p1_generated_first_paragraph_human_outline)
optimized_paragraph = create_optimized_paragraph_based_on_token_loss(
    p1_analysis_df_w_no_context_first_paragraph, 
    p1_generated_first_paragraph_human_outline_df
)

print("Optimized Paragraph:")
print(optimized_paragraph)

Optimized Paragraph:
This people believe that there is only one multiple perspectives on in the complexities of and and are the, characters. narratives and that the power of around them, shape identity else history. world is the works of characteraa Gyasi,'t Lee, and are Polley, this paper argues that the same time, are not merely individual our but are also shaped by broader historical contexts, familial dynamics, and societalinko, and paper seeks to demonstrate how authors and filmmakers employ multiple perspectives to situations the multifaceted nature of human experience, emphasizings lives of be narratives in understanding by complexities of they make history. and the people around them including their ancestors. 


## Generated Introduction Paragraph Given Human Outline & Essay

In [126]:
p1_generated_first_paragraph_human_outline_and_essay_df = analyze_text(p1_generated_first_paragraph_human_outline_and_essay)
optimized_paragraph = create_optimized_paragraph_based_on_token_loss(
    p1_analysis_df_w_no_context_second_paragraph, 
    p1_generated_first_paragraph_human_outline_and_essay_df
)

print("Optimized Paragraph:")
print(optimized_paragraph)

Optimized Paragraph:
In thes of are existence, by the intertwine, in the past of our individual journeys, or unsuccessful find ourselves or dishonorable. The narrative that pervades our understanding of how world.’s choices can shape their power in shaping their personal narrative,inko by Min Jin Lee, the choices of the who came before them of play a, role in defining our get and destinies. This essay the how of and people utilize to her. In the personal, it choices shape “ narratives and did not challenging her, about the nature of history and would. disgraced forever. The child would be another no-name bastard. Her mother’s boardinghouse would be contaminated by her shame”(Lee 49). Sunja made a big mistake by being with Hansu, and she paid the price. She was now pregnant, and the father of the child won’t be able to marry her because he is already married. In her society, bearing a child without a father will lead to having the mother, in this case, Sunja, be disowned. It could also 

# Results (Quantitative)
## BLEU SCORES
The BLEU score measures the surface similarity between the human-written essays and the AI-generated ones. It counts how many of the words and short phrases (n-grams) in the generated text also appear in the original text. A higher BLEU score means the generated essay is closer in wording to the human essay, which we use as a proxy for how well the model captures the intended meaning and style.

In [132]:
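from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

def calculate_bleu(original_text, generated_text):
    """Average sentence-level BLEU of the generated text against the original."""
    smoothie = SmoothingFunction().method4
    original_sentences = sent_tokenize(original_text)
    generated_sentences = sent_tokenize(generated_text)

    # Sentences are paired by position; zip() stops at the shorter text.
    bleu_scores = [
        sentence_bleu([word_tokenize(orig)], word_tokenize(gen), smoothing_function=smoothie)
        for orig, gen in zip(original_sentences, generated_sentences)
    ]

    # Return 0 when no sentence pairs exist (for example, an empty input).
    average_bleu = sum(bleu_scores) / len(bleu_scores) if bleu_scores else 0
    return average_bleu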
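
To make the n-gram overlap behind BLEU concrete, the sketch below computes clipped n-gram precision in plain Python. It is only an illustration of the underlying quantity: it omits the brevity penalty, the smoothing, and the geometric mean over n-gram orders that `sentence_bleu` applies. The helper names and the two example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision of `hypothesis` against one `reference`."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    # Credit each hypothesis n-gram at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return overlap / total if total else 0.0


# Hypothetical sentences, split on whitespace for simplicity.
reference = "the choices of others shape our lives".split()
hypothesis = "the choices of ancestors shape our identities".split()

p1 = modified_precision(reference, hypothesis, 1)
p2 = modified_precision(reference, hypothesis, 2)
print(f"unigram precision: {p1:.3f}, bigram precision: {p2:.3f}")
```

Precision falls as n grows because longer phrases are less likely to match exactly, which is part of why sentence-level BLEU between two independently written essays tends to be low.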

In [133]:
original_essay = p1_essay_first_paragraph + "\n\n" + p1_essay_second_paragraph + "\n\n" + p1_essay_third_paragraph

no_context_text = p1_most_likely_token_first_paragraph + "\n" + p1_most_likely_token_second_paragraph + "\n" + p1_most_likely_token_third_paragraph
model_outline_text = p1_generated_first_paragraph_models_outline + "\n" + p1_generated_second_paragraph_models_outline + "\n" + p1_generated_third_paragraph_models_outline
human_outline_text = p1_generated_first_paragraph_human_outline + "\n" + p1_generated_second_paragraph_human_outline + "\n" + p1_generated_third_paragraph_human_outline
model_and_human_outline_text = p1_generated_first_paragraph_human_outline_and_essay + "\n" + p1_generated_second_paragraph_human_outline_and_essay + "\n" + p1_generated_third_paragraph_human_outline_and_essay

bleu_scores = {
    'No Context': calculate_bleu(original_essay, no_context_text),
    'Model Outline': calculate_bleu(original_essay, model_outline_text),
    'Human Outline': calculate_bleu(original_essay, human_outline_text),
    'Model and Human Outline': calculate_bleu(original_essay, model_and_human_outline_text)

}

# Print BLEU scores
for context, score in bleu_scores.items():
    print(f"BLEU Score for {context}: {score:.4f}")

BLEU Score for No Context: 0.1311
BLEU Score for Model Outline: 0.1941
BLEU Score for Human Outline: 0.1860
BLEU Score for Model and Human Outline: 0.1930


## FLESCH READING EASE SCORES
The Flesch Reading Ease test estimates how easy the essays generated by the model are to understand, based on sentence length and word length. Higher scores indicate text that is easier to read. This tells us whether the text is accessible to readers, which is especially important in an educational context where clarity and comprehension are key.

In [273]:
import textstat

def calculate_flesch_reading(text):
    """Return the Flesch Reading Ease score (higher means easier to read)."""
    score = textstat.flesch_reading_ease(text)
    return score
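
For intuition about what drives this score, the sketch below applies the standard Flesch Reading Ease formula to raw counts. It is not `textstat`'s implementation, whose word, sentence, and syllable counting may differ. The function name and the counts are hypothetical.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Flesch Reading Ease from raw counts, using the standard formula."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical counts: the same number of words, but the "dense" text
# uses longer sentences and words with more syllables.
plain = flesch_reading_ease(total_words=100, total_sentences=8, total_syllables=140)
dense = flesch_reading_ease(total_words=100, total_sentences=3, total_syllables=185)
print(f"plain: {plain:.2f}, dense: {dense:.2f}")
```

Both terms are subtracted, so longer sentences and longer words each lower the score. The syllables-per-word term carries the larger weight.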

In [274]:
flesch_score_no_context = calculate_flesch_reading(no_context_text)
flesch_score_model_outline = calculate_flesch_reading(model_outline_text)
flesch_score_human_outline = calculate_flesch_reading(human_outline_text)
flesch_score_model_and_human_outline = calculate_flesch_reading(model_and_human_outline_text)

print("Flesch Reading Ease Score for No Context:", flesch_score_no_context)
print("Flesch Reading Ease Score for Model Outline:", flesch_score_model_outline)
print("Flesch Reading Ease Score for Human Outline:", flesch_score_human_outline)
print("Flesch Reading Ease Score for Model and Human Outline:", flesch_score_model_and_human_outline)

Flesch Reading Ease Score for No Context: 68.5
Flesch Reading Ease Score for Model Outline: 16.02
Flesch Reading Ease Score for Human Outline: 19.4
Flesch Reading Ease Score for Model and Human Outline: 14.12


## Result Analysis:
Our analysis begins with the first approach, where the model generated text without specific context, relying solely on the most likely token choices. This approach yielded a BLEU score of 0.1311, the lowest similarity to the original essay, though the text was relatively easy to read, with a Flesch Reading Ease score of 68.5.

We then experimented with generating an outline from a human-written essay and subsequently revising that outline to create a new paragraph. This method resulted in a BLEU score of 0.1941, the highest of the four, although readability decreased sharply, to a Flesch score of 16.02. This drop suggests longer sentences and longer words, which is consistent with denser, more traditionally academic prose.

Another approach used a human-written outline directly to generate text. It produced a BLEU score of 0.1860 and a slightly higher readability score of 19.4, suggesting that human-driven outlines help balance content accuracy against complexity.

Lastly, providing both a human-written essay and an outline to guide the generation resulted in a BLEU score of 0.1930, close behind the model-generated outline, and the lowest Flesch score, 14.12, which suggests the text was the densest and possibly the most challenging to read.

The three guided approaches scored within 0.01 BLEU of one another (0.1860 to 0.1941), so the clearest difference is between providing some context and providing none. Each method showed different strengths and weaknesses, so the choice of approach depends on whether readability or similarity to the original essay matters more for a given application.

# Limitations & Future Directions
## Data Limitation: 
The quality and extent of the dataset used for training the machine learning model greatly influence the performance and reliability of the generated essays. Limited or biased data can skew the model's understanding and output.
## Human Factor in Data Collection & Training:
The involvement of humans in data collection and model training introduces subjective biases. These biases can affect the model’s ability to generate neutral and universally applicable content.
## Content Selection:
The selection of topics and materials included in the training set can restrict the diversity and scope of the model’s output. This limitation could lead to a lack of comprehensive coverage on less familiar or new topics.
## Human Revision & Quality of Generated Essay:
The dependency on human revision to ensure the relevance and accuracy of AI-generated content highlights a significant limitation in current AI capabilities. This process can be resource-intensive and may not always guarantee consistency in quality.

# Conclusion
This project revealed challenges and ethical concerns with using AI in academic writing, notably the significant role of human oversight in data preparation and the limitations of numerical scores for evaluating essays. Looking forward, enhancing AI's understanding of nuanced text and ensuring transparent, ethical processes are crucial. Future research should focus on developing better evaluation metrics and more effective human-AI collaboration to improve AI's educational applications.