# Hugging Face Transformers Assignment Solutions

## 0. Create a New Environment

Command line code to execute in the Terminal (Mac) or Anaconda Prompt (PC):

#### 1. view, create and switch environments
```
conda env list
conda create --name nlp_transformers
conda env list
conda activate nlp_transformers
```

#### 2. install and view packages
```
conda install python jupyter notebook pandas numpy scikit-learn openpyxl transformers pytorch
conda list
```

## 1. Sentiment Analysis

1. Create a new _nlp_transformers_ environment
2. Launch Jupyter Notebook
3. Read in the movie reviews data set including the VADER sentiment scores (_movie_reviews_sentiment.csv_)
4. Apply sentiment analysis to the _movie_info_ column using transformers
5. Compare the transformers sentiment scores with the VADER sentiment scores

In [1]:
# view movie reviews with vader sentiment scores
import pandas as pd

movies = pd.read_csv('../Data/movie_reviews_sentiment.csv')
movies.head(2)

Unnamed: 0,movie_title,rating,genre,in_theaters_date,movie_info,directors,director_gender,tomatometer_rating,audience_rating,critics_consensus,sentiment_vader
0,A Dog's Journey,PG,"Drama, Kids & Family",5/17/19,Bailey (voiced again by Josh Gad) is living th...,Gail Mancuso,female,50,92,A Dog's Journey is as sentimental as one might...,0.9837
1,A Dog's Way Home,PG,Drama,1/11/19,"Separated from her owner, a dog sets off on an...",Charles Martin Smith,male,60,71,A Dog's Way Home may not quite be a family-fri...,0.9237


In [2]:
%%time
# GPU approach (if available)

from transformers import pipeline, logging

logging.set_verbosity_error()

sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    device='mps'  # 'mps' for Apple silicon Macs, 'cuda' for PCs with an NVIDIA GPU
)

sentiment_scores = movies.movie_info.apply(sentiment_analyzer)
sentiment_scores

CPU times: user 4.73 s, sys: 809 ms, total: 5.54 s
Wall time: 7.38 s


0      [{'label': 'POSITIVE', 'score': 0.998246908187...
1      [{'label': 'POSITIVE', 'score': 0.999533653259...
2      [{'label': 'POSITIVE', 'score': 0.999443471431...
3      [{'label': 'POSITIVE', 'score': 0.999460160732...
4      [{'label': 'POSITIVE', 'score': 0.997202277183...
                             ...                        
161    [{'label': 'POSITIVE', 'score': 0.998772561550...
162    [{'label': 'POSITIVE', 'score': 0.998496770858...
163    [{'label': 'POSITIVE', 'score': 0.998909831047...
164    [{'label': 'POSITIVE', 'score': 0.991357326507...
165    [{'label': 'NEGATIVE', 'score': 0.998446881771...
Name: movie_info, Length: 166, dtype: object

In [3]:
%%time
# potential CPU approach (if GPU unavailable) -- review pro tip: speed up transformers code lesson for more options

from transformers import pipeline, logging

logging.set_verbosity_error()

sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    device=-1 # cpu
)

import torch
torch.set_num_threads(1)

sentiment_scores = movies.movie_info.apply(sentiment_analyzer)
sentiment_scores

CPU times: user 2min 11s, sys: 4.69 s, total: 2min 16s
Wall time: 19.4 s


0      [{'label': 'POSITIVE', 'score': 0.998246908187...
1      [{'label': 'POSITIVE', 'score': 0.999533653259...
2      [{'label': 'POSITIVE', 'score': 0.999443471431...
3      [{'label': 'POSITIVE', 'score': 0.999460160732...
4      [{'label': 'POSITIVE', 'score': 0.997202277183...
                             ...                        
161    [{'label': 'POSITIVE', 'score': 0.998772561550...
162    [{'label': 'POSITIVE', 'score': 0.998496770858...
163    [{'label': 'POSITIVE', 'score': 0.998909831047...
164    [{'label': 'POSITIVE', 'score': 0.991357326507...
165    [{'label': 'NEGATIVE', 'score': 0.998446881771...
Name: movie_info, Length: 166, dtype: object
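
Cells 2 and 3 differ only in the `device` argument, so the choice can be made at runtime. The sketch below is an assumption, not part of the original solution: `pick_device` is a hypothetical helper, and it takes the availability flags as plain booleans so the example runs even without PyTorch installed.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string that could be passed to pipeline(device=...)."""
    if cuda_available:
        return "cuda"  # NVIDIA GPU
    if mps_available:
        return "mps"   # Apple silicon GPU
    return "cpu"       # fallback when no GPU backend is available


# In the notebook, the flags would come from PyTorch, for example:
#   import torch
#   device = pick_device(torch.cuda.is_available(),
#                        torch.backends.mps.is_available())
print(pick_device(cuda_available=False, mps_available=True))
```

The returned string would replace the hard-coded `'mps'` or `-1` in the `pipeline(...)` calls above.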

In [4]:
# extract the label and score and create a sentiment score
pd.set_option('display.max_colwidth', None)

movies['label_hf'] = sentiment_scores.apply(lambda x: x[0]['label'])
movies['score_hf'] = sentiment_scores.apply(lambda x: x[0]['score'])
movies['sentiment_hf'] = movies.apply(lambda row: row['score_hf'] if row['label_hf'] == 'POSITIVE' else -row['score_hf'], axis=1)
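
The signed score logic in cell 4 can also be written as one small function. This is a minimal sketch: `signed_sentiment` is a hypothetical name, and the two sample results use rounded, illustrative scores rather than the exact values from the output above.

```python
def signed_sentiment(result):
    """Map a result like [{'label': ..., 'score': ...}] onto the range [-1, 1]."""
    top = result[0]  # the pipeline returns a one-element list per input text
    return top["score"] if top["label"] == "POSITIVE" else -top["score"]


# Illustrative results in the same shape as the sentiment_scores entries
examples = [
    [{"label": "POSITIVE", "score": 0.9982}],
    [{"label": "NEGATIVE", "score": 0.9984}],
]
signed = [signed_sentiment(r) for r in examples]
print(signed)
```

In the notebook, `sentiment_scores.apply(signed_sentiment)` should give the same values as `sentiment_hf`, assuming the intermediate `label_hf` and `score_hf` columns are not needed.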

In [5]:
# compare sentiment scores
movies[['movie_title', 'movie_info', 'sentiment_vader', 'sentiment_hf']].sort_values('sentiment_hf').head()

Unnamed: 0,movie_title,movie_info,sentiment_vader,sentiment_hf
22,Braid,"Two wanted women decide to rob their wealthy yet mentally unstable friend who lives in a fantasy world they all created as children. To take her money, the girls must take part in a deadly and perverse game of make believe throughout a sprawling yet decaying estate. As things become increasingly violent and hallucinatory, they realize that obtaining the money may be the least of their concerns.",-0.8316,-0.999203
103,Spider-Man: Far From Home,"Peter Parker returns in Spider-Man: Far From Home, the next chapter of the Spider-Man: Homecoming series! Our friendly neighborhood Super Hero decides to join his best friends Ned, MJ, and the rest of the gang on a European vacation. However, Peter's plan to leave super heroics behind for a few weeks are quickly scrapped when he begrudgingly agrees to help Nick Fury uncover the mystery of several elemental creature attacks, creating havoc across the continent!",0.9722,-0.998805
34,Dragged Across Concrete,"DRAGGED ACROSS CONCRETE follows two police detectives who find themselves suspended when a video of their strong-arm tactics is leaked to the media. With little money and no options, the embittered policemen descend into the criminal underworld and find more than they wanted waiting in the shadows.",-0.9015,-0.998734
165,Yesterday,"Jack Malik (Himesh Patel, BBC's Eastenders) is a struggling singer-songwriter in a tiny English seaside town whose dreams of fame are rapidly fading, despite the fierce devotion and support of his childhood best friend, Ellie (Lily James, Mamma Mia! Here We Go Again). Then, after a freak bus accident during a mysterious global blackout, Jack wakes up to discover that The Beatles have never existed... and he finds himself with a very complicated problem, indeed.",0.1365,-0.998447
102,Skin,"A white supremacist reforms his life after falling in love but saying goodbye to his skinhead life isn't a clean process. He must betray his former gang and work alongside the FBI in order to remove the body ink that has represented his identity for so long, as well as the burden of the gang's crimes he has carried.",-0.8377,-0.996846


In [6]:
# observations (rows are numbered in the order displayed above)
# for rows 1, 3 & 5 (braid, dragged across concrete, skin), both sentiment scores are strongly negative
# for rows 2 & 4, the sentiment scores differ:
# - row 2 (spider-man): vader scored it as very positive and hf's transformers as very negative - the actual text seems mixed
# - row 4 (yesterday): vader scored it as close to neutral (0.1365) and hf's transformers as very negative - the actual text seems somewhat negative
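
The disagreements noted in these observations can also be found programmatically. The sketch below copies the five displayed rows by hand and flags titles where the two scores have opposite signs. Treating 0 as the cut-off for both scores is an assumption made for illustration.

```python
# (title, sentiment_vader, sentiment_hf) copied from the five rows shown above
rows = [
    ("Braid", -0.8316, -0.999203),
    ("Spider-Man: Far From Home", 0.9722, -0.998805),
    ("Dragged Across Concrete", -0.9015, -0.998734),
    ("Yesterday", 0.1365, -0.998447),
    ("Skin", -0.8377, -0.996846),
]


def disagree(vader, hf):
    """True when one score is negative and the other is not."""
    return (vader >= 0) != (hf >= 0)


disagreements = [title for title, vader, hf in rows if disagree(vader, hf)]
print(disagreements)
```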

## 2. Named Entity Recognition

1. Read in the children's books data set (_childrens_books.csv_)
2. Apply NER to the Description column
3. Create a list of all named entities
4. Only include the people (PER)
5. _Extra credit:_ Exclude the authors as well

In [7]:
# view the childrens books
books = pd.read_csv('../Data/childrens_books.csv')
books.head(2)

Unnamed: 0,Ranking,Title,Author,Year,Rating,Description
0,1,Where the Wild Things Are,Maurice Sendak,1963,4.25,"Where the Wild Things Are follows Max, a young boy who, after being sent to his room for misbehaving, imagines sailing to an island filled with wild creatures. As their king, Max tames the beasts and eventually returns home to find his supper waiting for him. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood, all captured through Sendak's whimsical illustrations and story."
1,2,The Very Hungry Caterpillar,Eric Carle,1969,4.34,"The Very Hungry Caterpillar tells the story of a caterpillar who eats through a variety of foods before eventually becoming a butterfly. Eric Carle’s use of colorful collage illustrations and rhythmic text has made this book a beloved classic for young readers. The simple, engaging story introduces children to days of the week, counting, and the concept of metamorphosis. It’s a staple in early childhood education."


In [8]:
# find the named entities in each description
ner_analyzer = pipeline("ner",
                        model="dbmdz/bert-large-cased-finetuned-conll03-english",
                        device='mps',
                        aggregation_strategy='simple')

In [9]:
# apply to a single book
ner_analyzer(books.Description[0])

[{'entity_group': 'MISC',
  'score': np.float32(0.9462517),
  'word': 'Where the Wild Things Are',
  'start': 0,
  'end': 25},
 {'entity_group': 'PER',
  'score': np.float32(0.9990614),
  'word': 'Max',
  'start': 34,
  'end': 37},
 {'entity_group': 'PER',
  'score': np.float32(0.9984414),
  'word': 'Max',
  'start': 175,
  'end': 178},
 {'entity_group': 'PER',
  'score': np.float32(0.97894603),
  'word': 'Sendak',
  'start': 380,
  'end': 386}]

In [10]:
# extract the people only
[entity['word'] for entity in ner_analyzer(books.Description[0]) if entity['entity_group'] == 'PER']

['Max', 'Max', 'Sendak']

In [11]:
# apply to all books
named_entities = books['Description'].apply(lambda row: [entity['word']
                                                         for entity in ner_analyzer(row)
                                                         if entity['entity_group'] == 'PER'])

In [12]:
# create a unique list of named entities
named_entities = list(set(named_entities.explode().dropna().tolist()))
named_entities[:10]

['Jess Aarons',
 'Ramona',
 'Silverstein',
 'C',
 'Grover',
 'Arnold Lobel',
 'Sachar',
 'Clifford',
 '##per',
 'Meg Murry']

In [13]:
# view the number of entities
len(named_entities)

165

In [14]:
# create a unique list of authors
authors = list(set(books.Author.tolist()))
authors[:10]

['Doreen Cronin',
 'Janette Sebring Lowrey',
 'Margaret Wise Brown',
 'Beatrix Potter',
 'L.M. Montgomery',
 'Laura Joffe Numeroff',
 'Eric Carle',
 'J.K. Rowling',
 'E.L. Konigsburg',
 'C.S. Lewis']

In [15]:
# view the number of authors
len(authors)

72

In [16]:
# exclude subwords and authors from the named entity list to create a final list of characters
named_entities_clean = [entity for entity in named_entities if entity not in authors and '#' not in entity]
named_entities_clean[:10]

['Jess Aarons',
 'Ramona',
 'Silverstein',
 'C',
 'Grover',
 'Sachar',
 'Clifford',
 'Meg Murry',
 'Laura Ingalls',
 'Basil E. Frankweiler']

In [17]:
# view the number of final characters
len(named_entities_clean)

145

In [18]:
# comments
# 1. named_entities_clean is not fully clean - the filter only removes exact matches on the full author name,
#    so surname-only mentions such as Silverstein remain, and Galdone (an author / illustrator, not a character)
#    also slipped through; this is common during text analysis since text data is rarely perfectly clean
# 2. the order of the entities in this notebook differs from the order in the video solution because we used
#    'set', which removes duplicates but does not preserve order, so the output order can change with each run
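
Both comments can be addressed with a slightly stricter cleaning step. The sketch below is one possible approach, not the course solution: `clean_characters` is a hypothetical helper, and the sample lists are a small hand-picked subset of the names seen above. It deduplicates with `dict.fromkeys`, which keeps first-seen order, and drops any entity made up entirely of words that appear in an author's name. The trade-off is that a character who shares a name with an author would be dropped too.

```python
def clean_characters(entities, authors):
    """Deduplicate in first-seen order, then drop subword pieces and author names."""
    # every individual word that appears in any author's name, e.g. {'Maurice', 'Sendak', ...}
    author_parts = {part for name in authors for part in name.split()}

    cleaned = []
    for entity in dict.fromkeys(entities):  # dict keys keep insertion order
        if "#" in entity:
            continue  # leftover subword token such as '##per'
        if all(part in author_parts for part in entity.split()):
            continue  # full author name or surname-only mention
        cleaned.append(entity)
    return cleaned


# small illustrative sample, not the full lists from the notebook
entities = ["Max", "Max", "Sendak", "Arnold Lobel", "##per", "Ramona"]
authors = ["Maurice Sendak", "Arnold Lobel", "Eric Carle"]
characters = clean_characters(entities, authors)
print(characters)
```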

## 3. Zero-Shot Classification

1. Apply zero-shot classification to the Description column using these five categories:
   * adventure & fantasy
   * animals & nature
   * mystery
   * humor
   * non-fiction
2. Find the number of books in each category and check a few to see if the results make sense

In [19]:
# find the category of each description
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli",
                      device='mps')

In [20]:
# apply to a single book
classifier(books.Description[0], ['adventure & fantasy', 'animals & nature', 'mystery', 'humor', 'non-fiction'])

{'sequence': "Where the Wild Things Are\xa0follows Max, a young boy who, after being sent to his room for misbehaving, imagines sailing to an island filled with wild creatures. As their king, Max tames the beasts and eventually returns home to find his supper waiting for him. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood, all captured through Sendak's whimsical illustrations and story.",
 'labels': ['adventure & fantasy',
  'animals & nature',
  'humor',
  'mystery',
  'non-fiction'],
 'scores': [0.40190133452415466,
  0.30649277567863464,
  0.14998996257781982,
  0.07116316258907318,
  0.07045280188322067]}

In [21]:
# extract just the top category
classifier(books.Description[0], ['adventure & fantasy', 'animals & nature', 'mystery', 'humor', 'non-fiction'])['labels'][0]

'adventure & fantasy'

In [22]:
# apply category labels to the books
books['Category'] = (books['Description'].apply(lambda x:
                      classifier(x, candidate_labels=['adventure & fantasy', 'animals & nature', 'mystery',
                                                      'humor', 'non-fiction'])['labels'][0]))

In [23]:
# view the number of books in each category
books.Category.value_counts()

Category
humor                  42
adventure & fantasy    18
non-fiction            15
animals & nature       13
mystery                12
Name: count, dtype: int64

In [24]:
# humor books
books[['Title', 'Description', 'Category']][books.Category == 'humor'].head(2)

Unnamed: 0,Title,Description,Category
3,Green Eggs and Ham,"In Green Eggs and Ham, Sam-I-Am tries to convince a reluctant character to try a dish of green eggs and ham, despite his resistance. Through repetition and rhyme, Dr. Seuss’s classic story about being open to new experiences encourages children to be adventurous and try things outside their comfort zone. The playful illustrations and humorous dialogue make it a fun and educational read for young readers.",humor
5,Charlotte’s Web,"Charlotte’s Web tells the story of a pig named Wilbur and his friendship with Charlotte, a clever spider who saves his life. Set on a farm, the novel explores themes of friendship, loyalty, and the cycle of life. Through Charlotte’s wise words and actions, Wilbur learns about love and sacrifice. White’s writing is filled with warmth and humor, and it’s a perfect read for both children and adults, dealing with timeless themes.",humor


In [25]:
# adventure & fantasy books
books[['Title', 'Description', 'Category']][books.Category == 'adventure & fantasy'].head(2)

Unnamed: 0,Title,Description,Category
0,Where the Wild Things Are,"Where the Wild Things Are follows Max, a young boy who, after being sent to his room for misbehaving, imagines sailing to an island filled with wild creatures. As their king, Max tames the beasts and eventually returns home to find his supper waiting for him. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood, all captured through Sendak's whimsical illustrations and story.",adventure & fantasy
12,Madeline,"Madeline follows the adventures of a brave and spirited little girl living in a convent in Paris. When she falls ill, she shows resilience and courage, never losing her sense of adventure. Bemelmans' charming illustrations and engaging storytelling make Madeline a beloved character in children’s literature. Themes of independence, courage, and friendship make this a timeless classic.",adventure & fantasy


In [26]:
# non-fiction books
books[['Title', 'Description', 'Category']][books.Category == 'non-fiction'].head(2)

Unnamed: 0,Title,Description,Category
2,The Giving Tree,"The Giving Tree is a touching and bittersweet story about a tree that gives everything it has to a boy over the course of his life. As the boy grows up, he takes more from the tree, and the tree continues to give, even when it has little left. Silverstein’s minimalist text and illustrations convey deep themes of unconditional love, selflessness, and the passage of time. It has sparked much discussion about relationships and sacrifice.",non-fiction
9,Love You Forever,"Love You Forever tells the heartwarming story of a mother’s enduring love for her son as he grows from a baby to an adult. Despite the passage of time and changing circumstances, the mother’s love remains constant. This book explores themes of unconditional love and the bond between parent and child, making it a perfect book for sharing at bedtime or during moments of reflection.",non-fiction


In [27]:
# animals & nature books
books[['Title', 'Description', 'Category']][books.Category == 'animals & nature'].head(2)

Unnamed: 0,Title,Description,Category
1,The Very Hungry Caterpillar,"The Very Hungry Caterpillar tells the story of a caterpillar who eats through a variety of foods before eventually becoming a butterfly. Eric Carle’s use of colorful collage illustrations and rhythmic text has made this book a beloved classic for young readers. The simple, engaging story introduces children to days of the week, counting, and the concept of metamorphosis. It’s a staple in early childhood education.",animals & nature
4,Goodnight Moon,"Goodnight Moon is a gentle, rhythmic bedtime story where a little bunny says goodnight to everything in his room, from the moon to the ""quiet old lady whispering hush."" Its repetitive structure and comforting tone make it ideal for young children. The simple illustrations by Clement Hurd complement the soothing nature of the story, making it a beloved classic for sleep-time reading.",animals & nature


In [28]:
# mystery books
books[['Title', 'Description', 'Category']][books.Category == 'mystery'].head(2)

Unnamed: 0,Title,Description,Category
13,"Harry Potter and the Sorcerer's Stone (Harry Potter, #1)","Harry Potter and the Sorcerer’s Stone introduces readers to Harry Potter, an orphan who discovers that he is a wizard and attends the magical Hogwarts School of Witchcraft and Wizardry. Along with his new friends, Harry uncovers mysteries surrounding his past and the dark wizard who killed his parents. This book starts the beloved series and sets the stage for Harry’s journey, filled with magic, adventure, and friendship.",mystery
20,The Little Prince,"The Little Prince is a philosophical novella that tells the story of a young prince who travels from planet to planet, learning about life, love, and human nature. Through encounters with various characters, he discovers that the most important things in life are invisible to the eye. The story’s allegorical messages about childhood innocence and adult perspective have made it a timeless classic, appealing to both children and adults alike.",mystery


In [29]:
# observations
# humor, adventure & fantasy, and animals & nature books all make a lot of sense
# non-fiction books make sense as a group, but they're not truly non-fiction, so the label could be renamed 'stories about growing up'
# mystery books make sense as a group too, and the label could be renamed 'books for older kids'
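
Since some top categories are assigned with fairly low scores, it can help to keep the score alongside the label. The sketch below reuses the rounded scores from the single-book output above. `top_label` is a hypothetical helper, and the 0.5 threshold is an arbitrary assumption chosen for illustration.

```python
# same structure as the classifier output above, with scores rounded
result = {
    "labels": ["adventure & fantasy", "animals & nature", "humor", "mystery", "non-fiction"],
    "scores": [0.4019, 0.3065, 0.1500, 0.0712, 0.0705],
}


def top_label(result, min_score=0.5, fallback="uncertain"):
    """Return the highest-scoring label, or a fallback if its score is too low."""
    # the pipeline sorts labels by descending score, so index 0 is the top one
    label, score = result["labels"][0], result["scores"][0]
    return label if score >= min_score else fallback


print(top_label(result))                 # below the 0.5 threshold
print(top_label(result, min_score=0.3))  # above the 0.3 threshold
```

Books that fall back to `'uncertain'` would be natural candidates for a manual check or a revised label set.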

## 4. Text Summarization

1. Apply text summarization to the Description column
2. Review the results to see if they make sense

In [30]:
# summarize each book description
summarizer = pipeline("summarization",
                      model="facebook/bart-large-cnn",
                      device='mps')

In [31]:
# summarize a single book description
summarizer(books.Description[0])

[{'summary_text': "Where the Wild Things Are follows Max, a young boy who, after being sent to his room for misbehaving, imagines sailing to an island filled with wild creatures. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood, all captured through Sendak's whimsical illustrations and story."}]

In [32]:
# tweak the parameters
summarizer(books.Description[0], min_length=10, max_length=50, early_stopping=True, length_penalty=.8)

[{'summary_text': 'Where the Wild Things Are follows Max, a young boy who imagines sailing to an island filled with wild creatures. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood.'}]

In [33]:
# extract just the summary
summarizer(books.Description[0], min_length=10, max_length=50, early_stopping=True, length_penalty=.8)[0]['summary_text']

'Where the Wild Things Are follows Max, a young boy who imagines sailing to an island filled with wild creatures. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood.'

In [34]:
# apply to all books
books['Summary'] = books.Description.apply(lambda row: summarizer(row,
                                           min_length=10, max_length=50,
                                           early_stopping=True, length_penalty=.8)[0]['summary_text'])

In [35]:
# view the summaries
books[['Title', 'Description', 'Summary']].head()

Unnamed: 0,Title,Description,Summary
0,Where the Wild Things Are,"Where the Wild Things Are follows Max, a young boy who, after being sent to his room for misbehaving, imagines sailing to an island filled with wild creatures. As their king, Max tames the beasts and eventually returns home to find his supper waiting for him. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood, all captured through Sendak's whimsical illustrations and story.","Where the Wild Things Are follows Max, a young boy who imagines sailing to an island filled with wild creatures. This iconic book explores themes of imagination, adventure, and the complex emotions of childhood."
1,The Very Hungry Caterpillar,"The Very Hungry Caterpillar tells the story of a caterpillar who eats through a variety of foods before eventually becoming a butterfly. Eric Carle’s use of colorful collage illustrations and rhythmic text has made this book a beloved classic for young readers. The simple, engaging story introduces children to days of the week, counting, and the concept of metamorphosis. It’s a staple in early childhood education.","Eric Carle’s use of colorful collage illustrations and rhythmic text has made this book a beloved classic for young readers. The simple, engaging story introduces children to days of the week, counting, and the concept of met"
2,The Giving Tree,"The Giving Tree is a touching and bittersweet story about a tree that gives everything it has to a boy over the course of his life. As the boy grows up, he takes more from the tree, and the tree continues to give, even when it has little left. Silverstein’s minimalist text and illustrations convey deep themes of unconditional love, selflessness, and the passage of time. It has sparked much discussion about relationships and sacrifice.","Silverstein’s minimalist text and illustrations convey deep themes of unconditional love, selflessness, and the passage of time. It has sparked much discussion about relationships and sacrifice."
3,Green Eggs and Ham,"In Green Eggs and Ham, Sam-I-Am tries to convince a reluctant character to try a dish of green eggs and ham, despite his resistance. Through repetition and rhyme, Dr. Seuss’s classic story about being open to new experiences encourages children to be adventurous and try things outside their comfort zone. The playful illustrations and humorous dialogue make it a fun and educational read for young readers.",Dr. Seuss’s classic story encourages children to be adventurous and try things outside their comfort zone. The playful illustrations and humorous dialogue make it a fun and educational read for young readers.
4,Goodnight Moon,"Goodnight Moon is a gentle, rhythmic bedtime story where a little bunny says goodnight to everything in his room, from the moon to the ""quiet old lady whispering hush."" Its repetitive structure and comforting tone make it ideal for young children. The simple illustrations by Clement Hurd complement the soothing nature of the story, making it a beloved classic for sleep-time reading.","Goodnight Moon is a gentle, rhythmic bedtime story. The simple illustrations by Clement Hurd complement the soothing nature of the story."


In [36]:
# most of these summaries look great and make sense, although the second one is cut off mid-word ("met"),
# most likely because generation stopped at the max_length=50 token limit
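
One way to tidy a truncated summary is to trim it back to the last complete sentence. The sketch below is a naive post-processing step, not part of the original solution: `trim_to_last_sentence` is a hypothetical helper that treats `.`, `!` and `?` as sentence endings, so abbreviations such as "Dr." or a closing quotation mark after the period would need extra handling.

```python
def trim_to_last_sentence(text):
    """Drop any trailing fragment after the last '.', '!' or '?'."""
    text = text.strip()
    end = max(text.rfind(mark) for mark in ".!?")
    # if no sentence-ending mark is found, return the text unchanged
    return text[: end + 1] if end != -1 else text


# the second summary from the table above
summary = ("Eric Carle’s use of colorful collage illustrations and rhythmic text "
           "has made this book a beloved classic for young readers. The simple, "
           "engaging story introduces children to days of the week, counting, "
           "and the concept of met")
trimmed = trim_to_last_sentence(summary)
print(trimmed)
```

Raising `max_length` is the other obvious option, at the cost of longer summaries.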

## 5. Document Similarity

1. Turn the Description column into embeddings using feature extraction
2. Compute the cosine similarity between Harry Potter and the Sorcerer’s Stone and all other books
3. Return the top 5 most similar books

In [37]:
# specify our model
import numpy as np

feature_extractor = pipeline("feature-extraction",
                             model="sentence-transformers/all-MiniLM-L6-v2",
                             device='mps')

# extract the embeddings ([0][0] keeps the first token's 384-dimensional vector for each description)
embeddings = books['Description'].apply(lambda x: feature_extractor(x)[0][0])
embeddings_books = np.vstack(embeddings)
embeddings_books.shape

(100, 384)
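
The `[0][0]` indexing keeps only the first token's vector from the nested list that the feature-extraction pipeline returns (one list per input, then one vector per token). Sentence-transformers models are typically used with mean pooling over all token vectors instead, so the sketch below shows both options on a tiny made-up output. `first_token` and `mean_pool` are hypothetical helpers, and whether mean pooling changes the recommendations here has not been tested.

```python
def first_token(pipeline_output):
    """What feature_extractor(x)[0][0] returns: the first token's vector."""
    return pipeline_output[0][0]


def mean_pool(pipeline_output):
    """Average the token vectors dimension by dimension."""
    token_vectors = pipeline_output[0]
    n_tokens = len(token_vectors)
    # zip(*...) groups the values of each dimension across all tokens
    return [sum(dim) / n_tokens for dim in zip(*token_vectors)]


# made-up output: 1 input text, 3 tokens, 2 dimensions (the real model uses 384)
fake_output = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
print(first_token(fake_output))
print(mean_pool(fake_output))
```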

In [38]:
# create a function to get book recommendations
from sklearn.metrics.pairwise import cosine_similarity

def get_similar_books(embeddings, book_index, book_details, top_n=3):
    
    # specify the book
    b_embedding = np.array(embeddings[book_index]).reshape(1, -1)
    
    # calculate cosine similarity scores
    similarity_scores = cosine_similarity(b_embedding, embeddings)
    similarity_scores_series = pd.Series(similarity_scores.flatten(), name='similarity_score')
    
    # combine with book info and their similarity scores
    similarity_df = pd.concat([book_details, similarity_scores_series], axis=1)
    
    # sort and return the book itself (similarity 1.0) followed by the top n most similar books
    return similarity_df.sort_values('similarity_score', ascending=False).iloc[0:top_n+1]
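
To make the `cosine_similarity` call less of a black box, the sketch below computes the same quantity by hand: the dot product of two vectors divided by the product of their lengths. It uses plain Python and made-up 2-dimensional vectors. `cosine` and `rank_by_similarity` are hypothetical helpers, and unlike `get_similar_books` the ranking skips the query item itself.

```python
import math


def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(vectors, query_index, top_n=3):
    """Return (index, score) pairs for the top_n vectors most similar to the query."""
    query = vectors[query_index]
    scores = [(i, cosine(query, v)) for i, v in enumerate(vectors) if i != query_index]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# made-up vectors: index 1 points almost the same way as index 0, index 3 the opposite way
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
ranking = rank_by_similarity(toy, 0, top_n=2)
print(ranking)
```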

In [39]:
# find the harry potter book index
books.Title[books.Title.str.contains('harry potter', case=False)]

13       Harry Potter and the Sorcerer's Stone (Harry Potter, #1)
97    Harry Potter and the Prisoner of Azkaban (Harry Potter, #3)
98     Harry Potter and the Chamber of Secrets (Harry Potter, #2)
Name: Title, dtype: object

In [40]:
# get book recommendations
get_similar_books(embeddings_books, 13, books[['Title', 'Description', 'Rating']], top_n=5)

Unnamed: 0,Title,Description,Rating,similarity_score
13,"Harry Potter and the Sorcerer's Stone (Harry Potter, #1)","Harry Potter and the Sorcerer’s Stone introduces readers to Harry Potter, an orphan who discovers that he is a wizard and attends the magical Hogwarts School of Witchcraft and Wizardry. Along with his new friends, Harry uncovers mysteries surrounding his past and the dark wizard who killed his parents. This book starts the beloved series and sets the stage for Harry’s journey, filled with magic, adventure, and friendship.",4.47,1.0
97,"Harry Potter and the Prisoner of Azkaban (Harry Potter, #3)","Harry Potter and the Prisoner of Azkaban is the third book in the Harry Potter series, where Harry returns to Hogwarts for his third year and uncovers secrets about his past. With the arrival of the mysterious Sirius Black, Harry must navigate dark truths and face his fears. This thrilling installment explores themes of loyalty, friendship, and identity, marking a turning point in the magical world of Harry Potter.",4.58,0.872638
98,"Harry Potter and the Chamber of Secrets (Harry Potter, #2)","Harry Potter and the Chamber of Secrets is the second book in the Harry Potter series, where Harry returns to Hogwarts for his second year and uncovers a hidden chamber within the school. As mysterious events unfold, Harry and his friends Ron and Hermione uncover dark secrets about the school’s past. Themes of courage, friendship, and standing up for what’s right are explored in this gripping magical adventure.",4.43,0.855368
63,The Witches,"The Witches tells the story of a young boy and his grandmother who uncover a secret society of witches who despise children and plot to turn them all into mice. With the help of his grandmother, the boy must outwit the witches and save the children. The book is known for its dark humor, thrilling suspense, and memorable characters. Though it can be a bit scary, it is beloved for its unique blend of fear, adventure, and courage.",4.18,0.799051
55,"The Wonderful Wizard of Oz (Oz, #1)","The Wonderful Wizard of Oz is the first book in Baum's Oz series and tells the story of Dorothy, a young girl from Kansas who is swept away to the magical land of Oz. Along with her new friends—the Scarecrow, Tin Man, and Cowardly Lion—Dorothy embarks on a journey to meet the Wizard and find her way home. The book is filled with themes of friendship, courage, and the belief in oneself, and has become an iconic tale in American literature.",4.0,0.788523
42,"The Hobbit, or There and Back Again (The Lord of the Rings, #0)","The Hobbit follows the journey of Bilbo Baggins, a hobbit who is thrust into an epic adventure with dwarves and the wizard Gandalf. Together, they set out to reclaim treasure guarded by the dragon Smaug. Along the way, Bilbo grows in courage, wisdom, and leadership. The book introduces readers to Tolkien's richly imagined world of Middle-earth and is the prelude to the Lord of the Rings trilogy, blending adventure, fantasy, and heroism.",4.29,0.773417


In [41]:
# these outputs are all about wizards and witches, which makes a lot of sense!