# Attempting to increase the leveling capabilities of GPT-4-Turbo with added context

This is a continuation of the previous study, which tested GPT-4-Turbo's ability to produce text at the proper reading level. In that study we found that the text was generally more complex than required. However, the leveler had limited information and context on reading levels. For this test, we add clear descriptions of the requirements for texts at each level. The context is still limited, but it is more information than before.

Approach:
1. Gather initial data (generated texts and their structure)
2. Convert each readability score to the mean of its grade range
3. Compare the grade the model was asked to target with the grade the text actually reached according to its readability score
4. Check whether the texts that were leveled after generation improved in score


# Data

The data was created in three ways:
1. The leveled text guidelines were pulled from Planning Period's text leveler database
2. The generated text and the leveled text came from API calls to OpenAI
3. The Flesch-Kincaid score was calculated with the textstat library for both texts

Characteristics of each call:
- Each model call has a system prompt message and a final prompt message
- The leveled passage prompt has more context, including the reading level requirements. The original passage prompt does not.

Steps:
- The first call generates text that is either fiction or nonfiction. It is given specific directions that help it tailor the passage for K-12 education.
- The second call is given directions to level the passage that was created in the first call. The idea is that if the model is not also creating the passage, it can focus on leveling to the proper reading level. This is the prompt with more information.
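
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the prompts or client code used in the study: `complete` is a hypothetical stand-in for whatever function sends a system prompt and a user prompt to the model and returns its reply, and the prompt wording and the `guidelines` dictionary are assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for a chat-completion call: (system prompt, user prompt) -> reply.
Complete = Callable[[str, str], str]


def generate_then_level(
    complete: Complete,
    passage_type: str,
    topic: str,
    target_grade: str,
    guidelines: Dict[str, str],
) -> Tuple[str, str]:
    """Run the two calls: generate a passage, then level it with extra context."""
    # Call 1: generation only, with no reading level requirements in the prompt.
    original = complete(
        "You write passages for K-12 education.",  # assumed wording
        f"Write a {passage_type} passage about: {topic}",
    )

    # Call 2: leveling only, with the reading level requirements added as context.
    requirement_lines = "\n".join(
        f"- {name}: {value}" for name, value in guidelines.items()
    )
    leveled = complete(
        "You rewrite passages to match a target reading level.",  # assumed wording
        f"Rewrite the passage for {target_grade} grade.\n"
        f"Requirements:\n{requirement_lines}\n\n"
        f"Passage:\n{original}",
    )
    return original, leveled
```

Passing the model call in as a function keeps the sketch independent of any particular client library and makes the flow easy to check with a fake `complete`.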

Limitations:
- Grades range from 5th to 12th


# Features 

- __Type of Passage__
- Original Grade
- Leveled Grade
- __Topic__
- Original Text
- Leveled Text
- Original Word Count
- Original Sentence Structure
- Original Vocabulary
- Original Content
- __Leveled Word Count__
- __Leveled Sentence Structure__
- __Leveled Vocabulary__
- __Leveled Content__
- Original Text Flesch Kincaid Readability Score
- Leveled Text Flesch Kincaid Readability Score

Most of these features are for reference and were __not__ given to the model in the API call. Only the __bolded__ features were given as context in the API call.


# What is Flesch-Kincaid?

The Flesch–Kincaid readability tests are readability tests designed to indicate how difficult a passage in English is to understand. There are two tests: the Flesch Reading-Ease, and the Flesch–Kincaid Grade Level. Although they use the same core measures (word length and sentence length), they have different weighting factors.

The results of the two tests correlate approximately inversely: a text with a comparatively high score on the Reading Ease test should have a lower score on the Grade-Level test. Rudolf Flesch devised the Reading Ease evaluation; somewhat later, he and J. Peter Kincaid developed the Grade Level evaluation for the United States Navy.
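
As a rough illustration of how the two tests share the same measures but weight them differently, here is a small sketch of the standard published formulas. It takes word, sentence, and syllable counts as inputs; how those counts are obtained (the textstat library does this internally) is outside the sketch, and the example counts are made up.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: higher scores mean harder text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Made-up counts for two 100-word passages: short sentences and short words
# versus long sentences and long words.
easier = (100, 10, 150)
harder = (100, 4, 180)

print(flesch_reading_ease(*easier), flesch_kincaid_grade(*easier))
print(flesch_reading_ease(*harder), flesch_kincaid_grade(*harder))
```

The passage with longer sentences and more syllables per word gets a lower Reading Ease score and a higher Grade Level, which is the inverse relationship described above.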

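The image below shows the score ranges for each level. The mapping used later in this notebook treats higher scores as easier text, which matches the Reading Ease scale.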

![image.png](attachment:582104ce-a5ea-4da1-9eef-18e364b03b2e.png)

source - https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests


In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np


# Ensures that plots appear in the notebook itself
%matplotlib inline


In [3]:
df = pd.read_csv('updated_data.csv')  
display(df.head())

Unnamed: 0,Type of Passage,Original Grade,Leveled Grade,Topic,Original Text,Leveled Text,Original Word Count,Original Sentence Structure,Original Vocabulary,Original Content,Leveled Word Count,Leveled Sentence Structure,Leveled Vocabulary,Leveled Content,Original Text Flesch Kincaid Readability Score,Leveled Text Flesch Kincaid Readability Score
0,Fiction,5th,7th,A Lost Treasure,"Unfortunately, without the text of ""A Lost Tre...","**Hypothetical Example: ""A Lost Treasure""**\n\...",250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,375-425 words,Rich variation with emphasis on rhetorical dev...,Advanced vocabulary with more sophisticated la...,Exploration of complex themes and multiple sto...,62.78,46.67
1,Fiction,5th,10th,Time Travel Adventure,"Unfortunately, I can't assist with rewriting o...",### Time Travel Adventure\n\nIn the shadowed c...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,575-650 words,Emphasis on cohesion and sophisticated structures,Advanced and nuanced vocabulary,Intensive exploration of complex literary them...,72.76,45.46
2,Fiction,5th,6th,A Haunted House Mystery,"To proceed with your request, I'll need the te...","To proceed with the task you've requested, I'l...",250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,300-375 words,Balanced mix of sentence structures,Introduction to more abstract and nuanced voca...,Introduction to themes and symbolism in stories,49.65,36.32
3,Fiction,5th,9th,An Epic Battle Between Kingdoms,**An Epic Battle Between Kingdoms**\n\nA long ...,**An Epic Battle Between Kingdoms**\n\nIn an e...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,500-575 words,Advanced and varied sentence structures,Rich and sophisticated vocabulary,Exploration of literary techniques and deeper ...,75.2,53.1
4,Fiction,5th,5th,Survival on an Alien Planet,**Survival on an Alien Planet**\n\nWhen Jamie ...,**Survival on an Alien Planet**\n\nJamie and h...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,84.68,76.01


# Converting the readability score to a mean grade

In [4]:
def map_fk_score_to_grade(score):
    """Map a readability score (higher = easier to read) to a school level label."""
    # Thresholds are checked from easiest to hardest; the first match wins.
    if score >= 90:
        return '5th grade'
    elif score >= 80:
        return '6th grade'
    elif score >= 70:
        return '7th grade'
    elif score > 65:
        return '8th grade'
    elif score > 60:
        return '9th grade'
    elif score > 55:
        return '10th grade'
    elif score > 50:
        return '11th grade'
    elif score > 30:
        return '12th grade'
    elif score > 20:
        return 'College undergraduate'
    elif score > 10:
        return 'College graduate'
    else:
        return 'Professional'


df['Original Text School Level'] = df['Original Text Flesch Kincaid Readability Score'].apply(map_fk_score_to_grade)
df['Leveled Text School Level'] = df['Leveled Text Flesch Kincaid Readability Score'].apply(map_fk_score_to_grade)

display(df.head())

Unnamed: 0,Type of Passage,Original Grade,Leveled Grade,Topic,Original Text,Leveled Text,Original Word Count,Original Sentence Structure,Original Vocabulary,Original Content,Leveled Word Count,Leveled Sentence Structure,Leveled Vocabulary,Leveled Content,Original Text Flesch Kincaid Readability Score,Leveled Text Flesch Kincaid Readability Score,Original Text School Level,Leveled Text School Level
0,Fiction,5th,7th,A Lost Treasure,"Unfortunately, without the text of ""A Lost Tre...","**Hypothetical Example: ""A Lost Treasure""**\n\...",250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,375-425 words,Rich variation with emphasis on rhetorical dev...,Advanced vocabulary with more sophisticated la...,Exploration of complex themes and multiple sto...,62.78,46.67,9th grade,12th grade
1,Fiction,5th,10th,Time Travel Adventure,"Unfortunately, I can't assist with rewriting o...",### Time Travel Adventure\n\nIn the shadowed c...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,575-650 words,Emphasis on cohesion and sophisticated structures,Advanced and nuanced vocabulary,Intensive exploration of complex literary them...,72.76,45.46,7th grade,12th grade
2,Fiction,5th,6th,A Haunted House Mystery,"To proceed with your request, I'll need the te...","To proceed with the task you've requested, I'l...",250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,300-375 words,Balanced mix of sentence structures,Introduction to more abstract and nuanced voca...,Introduction to themes and symbolism in stories,49.65,36.32,12th grade,12th grade
3,Fiction,5th,9th,An Epic Battle Between Kingdoms,**An Epic Battle Between Kingdoms**\n\nA long ...,**An Epic Battle Between Kingdoms**\n\nIn an e...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,500-575 words,Advanced and varied sentence structures,Rich and sophisticated vocabulary,Exploration of literary techniques and deeper ...,75.2,53.1,7th grade,11th grade
4,Fiction,5th,5th,Survival on an Alien Planet,**Survival on an Alien Planet**\n\nWhen Jamie ...,**Survival on an Alien Planet**\n\nJamie and h...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,250-300 words,Greater variation in sentence structure,Introduction to figurative language,Expanding on character development and introdu...,84.68,76.01,6th grade,7th grade
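
To compare the grade the model was asked to target with the level the score maps to, the two label columns need to be put on the same numeric scale. The helpers below are a minimal sketch of one way to do that; the function names are hypothetical and not part of the original notebook, and labels above 12th grade (such as 'College undergraduate') are simply treated as not comparable.

```python
import re
from typing import Optional


def grade_to_number(label: str) -> Optional[int]:
    """Return the numeric grade from '7th' or '7th grade'; None for other labels."""
    match = re.match(r"(\d+)(st|nd|rd|th)\b", label.strip())
    return int(match.group(1)) if match else None


def grade_gap(target_label: str, measured_label: str) -> Optional[int]:
    """Measured grade minus target grade; positive means harder than requested."""
    target = grade_to_number(target_label)
    measured = grade_to_number(measured_label)
    if target is None or measured is None:
        return None
    return measured - target


# Row 0 above: leveled grade '7th', leveled text mapped to '12th grade'.
print(grade_gap("7th", "12th grade"))
# Row 4 above: leveled grade '5th', leveled text mapped to '7th grade'.
print(grade_gap("5th", "7th grade"))
```

On the DataFrame, the same helper could be applied row by row, for example `df.apply(lambda r: grade_gap(r['Leveled Grade'], r['Leveled Text School Level']), axis=1)`.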



# Plotting Original Text Scores vs Leveled Text Scores