---
title: Lab 3 Singing a Song
author: Marvin (Wenxiang) Li
format:
    html:
        toc: true
        code-fold: true
        embed-resources: true
---

### Import and Clean Data

In [26]:
import numpy as np
import pandas as pd
xmas = pd.read_csv("https://www.dropbox.com/scl/fi/qxaslqqp5p08i1650rpc4/xmas.csv?rlkey=erdxi7jbh7pqf9fh4lv4cayp5&dl=1")

- Recode Day.in.Words as number words (One, Two, ...) by mapping from Day

In [27]:
num_to_word = {
    1: "One",
    2: "Two",
    3: "Three",
    4: "Four",
    5: "Five",
    6: "Six",
    7: "Seven",
    8: "Eight",
    9: "Nine",
    10: "Ten",
    11: "Eleven",
    12: "Twelve"
}

In [28]:
xmas["Day.in.Words"] = xmas["Day"].map(num_to_word)
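`Series.map` with a dictionary leaves a missing value wherever a key has no entry, so a quick check after recoding can catch gaps in the lookup table. A minimal sketch, using a made-up three-element Series rather than the lab data:

```python
import pandas as pd

# Hypothetical example data: 13 has no entry in the lookup dictionary
days = pd.Series([1, 2, 13])
lookup = {1: "One", 2: "Two"}

# Keys missing from the dictionary map to a missing value
words = days.map(lookup)

# Collect the inputs that failed to map
missing = days[words.isna()]
print(missing.tolist())
```

An empty `missing` Series would indicate that every day number was recoded.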

In [29]:
xmas.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Day           12 non-null     int64 
 1   Day.in.Words  12 non-null     object
 2   Gift.Item     12 non-null     object
 3   Verb          7 non-null      object
 4   Adjective     4 non-null      object
 5   Location      1 non-null      object
dtypes: int64(1), object(5)
memory usage: 708.0+ bytes


In [30]:
xmas.head(12)

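|   | Day | Day.in.Words | Gift.Item | Verb | Adjective | Location |
|---|---|---|---|---|---|---|
| 0 | 1 | One | partridge | | | in a pear tree |
| 1 | 2 | Two | dove | | turtle | |
| 2 | 3 | Three | hen | | french | |
| 3 | 4 | Four | bird | | calling | |
| 4 | 5 | Five | ring | | golden | |
| 5 | 6 | Six | goose | a-laying | | |
| 6 | 7 | Seven | swan | a-swimming | | |
| 7 | 8 | Eight | maid | a-milking | | |
| 8 | 9 | Nine | lady | dancing | | |
| 9 | 10 | Ten | lord | a-leaping | | |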


### Function 1: pluralize_gift() Vectorized Version

- Boolean masks let the function operate on a whole Series at once instead of one word at a time.

In [31]:
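def pluralize_gift(gift):
  """Pluralize a Series of gift names.

  Words containing "oo" get "ee", words ending in "y" have "y" replaced by "ies",
  and all other words get an "s" appended.
  The masked assignments write into `gift` itself; pass `gift.copy()` to keep
  the caller's data unchanged.
  """
  # Build all three masks before any value is modified
  regular = ~(gift.str.contains("oo")) & ~(gift.str.endswith("y"))
  oo_words = gift.str.contains("oo")
  y_words = gift.str.endswith("y")
  gift[regular] = gift[regular] + 's'
  gift[oo_words] = gift[oo_words].str.replace("oo", "ee")
  gift[y_words] = gift[y_words].str.replace("y", "ies")

  return gift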

- Test with a small dataset

In [32]:
test_gift = pd.DataFrame(['goose','cat','lady'])
test_gift = test_gift.rename(columns = {0:"gift"})
print(pluralize_gift(test_gift['gift']))


0     geese
1      cats
2    ladies
Name: gift, dtype: object


- Test with xmas

In [33]:
print(pluralize_gift(xmas['Gift.Item']))

0     partridges
1          doves
2           hens
3          birds
4          rings
5          geese
6          swans
7          maids
8         ladies
9          lords
10        pipers
11      drummers
Name: Gift.Item, dtype: object


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  gift[regular] = gift[regular] + 's'
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  gift[oo_words] = gift[oo_words].str.replace("oo", "ee")
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  gift[y_words] = gift[y_words].str.replace("y", "ies")
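These messages are pandas' SettingWithCopyWarning: `pluralize_gift()` assigns into the Series it receives, and `xmas['Gift.Item']` is tied to the DataFrame. In the pandas version that produced this output, the call appears to have changed the `Gift.Item` column itself, which would explain the doubled plurals such as "dovess" later on. A minimal sketch of a non-mutating variant follows. The name `pluralize_gift_copy` is hypothetical and not part of the original lab, and the `y$` pattern is an added assumption so that only a trailing "y" is replaced.

```python
import pandas as pd

def pluralize_gift_copy(gift):
    """Return pluralized gift names without modifying the input (hypothetical helper)."""
    # Work on an independent copy so the caller's Series or DataFrame column is untouched
    gift = pd.Series(gift).copy()

    # Same three cases as pluralize_gift(), computed before any value changes
    oo_words = gift.str.contains("oo", regex=False)
    y_words = gift.str.endswith("y")
    regular = ~oo_words & ~y_words

    gift[regular] = gift[regular] + "s"
    gift[oo_words] = gift[oo_words].str.replace("oo", "ee", regex=False)
    # Assumption: only a trailing "y" should become "ies"
    gift[y_words] = gift[y_words].str.replace(r"y$", "ies", regex=True)
    return gift

original = pd.Series(["goose", "cat", "lady"])
plural = pluralize_gift_copy(original)
print(plural.tolist())
print(original.tolist())
```

Because the function works on a copy, calling it on `xmas['Gift.Item']` for testing should leave the column singular for later steps.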


### Function 2: make_phrase()

In [34]:

def make_phrase(day, num_word, gift, verb, adjective, location):
    """
    Constructs a phrase based on the given parameters: day, num_word, gift, verb, adjective, and location.
    All inputs can be single values or pandas Series.
    """
    # Convert all inputs to pandas Series
    day = pd.Series(day)
    num_word = pd.Series(num_word)
    gift = pd.Series(gift)
    verb = pd.Series(verb)  
    adjective = pd.Series(adjective)
    location = pd.Series(location)

    # Step 1: Replace NAs with blank strings
    verb = verb.fillna("")   # fillna is a Series/DataFrame method, hence the conversion above
    adjective = adjective.fillna("")
    location = location.fillna("")

    # Step 2: Pluralize gift items where day > 1
    pluralize_mask = day > 1
    gift[pluralize_mask] = pluralize_gift(gift[pluralize_mask])

    # Step 3: Check if the gift item starts with a vowel
    # (str.startswith accepts a tuple and tests the start of each string against every entry)
    vowel_start = gift.str.startswith(('A', 'E', 'I', 'O', 'U', 'a', 'e', 'i', 'o', 'u'))

    # Step 4: Create the article based on day and vowel start
    article = np.where(day == 1, np.where(vowel_start, "An", "A"), num_word)

    # Step 5: Construct the phrase
    phrase = article + " " + gift + " " + verb
    phrase = phrase.str.strip()  # Remove any extra whitespace

    # Add adjective and location if they are not empty
    phrase += np.where(adjective.str.len() > 0, " " + adjective, "")
    phrase += np.where(location.str.len() > 0, " " + location, "")

    return phrase.str.strip()  # Return the phrase without extra spaces

- Apply the make_phrase function

In [35]:
xmas['Full.Phrase'] = xmas.apply(
    lambda row: make_phrase(
        row['Day'],
        row['Day.in.Words'],
        row['Gift.Item'],
        row['Verb'],
        row['Adjective'],
        row['Location']
    ),
    axis=1
)

# Print the new column to see the results
print(xmas['Full.Phrase'])

0     A partridges in a pear tree
1               Two dovess turtle
2              Three henss french
3             Four birdss calling
4              Five ringss golden
5             Six geeses a-laying
6         Seven swanss a-swimming
7          Eight maidss a-milking
8            Nine ladiess dancing
9            Ten lordss a-leaping
10          Eleven piperss piping
11      Twelve drummerss drumming
Name: Full.Phrase, dtype: object
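In this output the adjective trails the gift ("Two dovess turtle"), whereas the song puts it first ("Two turtle doves"). One way to get that order is to join the parts as number word, adjective, gift, verb, location, skipping whatever is missing. The sketch below illustrates only the joining step, under the assumption that the gift is already pluralized. The helper name `join_parts` and the demo values are hypothetical.

```python
import pandas as pd

def join_parts(num_word, adjective, gift, verb, location):
    """Join phrase parts in song order, skipping missing pieces (hypothetical helper)."""
    parts = pd.DataFrame({
        "num_word": pd.Series(num_word),
        "adjective": pd.Series(adjective),
        "gift": pd.Series(gift),
        "verb": pd.Series(verb),
        "location": pd.Series(location),
    })
    # Keep only non-empty strings, so NaN/None entries drop out without extra spaces
    return parts.apply(
        lambda row: " ".join(p for p in row if isinstance(p, str) and p),
        axis=1,
    )

demo = join_parts(
    num_word=["Two", "Six"],
    adjective=["turtle", None],
    gift=["doves", "geese"],
    verb=[None, "a-laying"],
    location=[None, None],
)
print(demo.tolist())
```

Joining only the non-missing parts also removes the need for the separate `strip()` calls.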


### Function 3: sing_day()

- Check that phrases can be looked up by index

In [36]:
print(xmas['Full.Phrase'][0])
print(xmas['Full.Phrase'][1])   

A partridges in a pear tree
Two dovess turtle


- An inner loop is the key to making the song work: on the outer loop's pass for day N, the inner loop runs N times, once per gift from day N down to day 1.

In [37]:
def sing_day(df, day_number, Full_Phrase):
  Full_Phrase = df[Full_Phrase]
  # dictionary mapping each day number to its ordinal word
  num_word = {1: "first", 2: "second", 3: "third",
    4: "fourth", 5: "fifth", 6: "sixth",
    7: "seventh", 8: "eighth", 9: "ninth",
    10: "tenth", 11: "eleventh", 12: "twelfth"
    }
  
  song = ""
  for i in range(day_number,0,-1):
    intro = "On the " + num_word[i] + " day of Christmas, my true love sent to me:" + "\n" 
    song = song + intro

    for j in range(i, 0, -1): 
      if i > 1 and j == 1:
        song += "and " + Full_Phrase[j - 1] + "."  + "\n"
      else:
        song += Full_Phrase[j - 1] + ","  + "\n"

    song += "\n" 
  return song.strip() 

- Check the function

In [39]:
print(sing_day(xmas, 5, 'Full.Phrase'))

On the fifth day of Christmas, my true love sent to me:
Five ringss golden,
Four birdss calling,
Three henss french,
Two dovess turtle,
and A partridges in a pear tree.

On the fourth day of Christmas, my true love sent to me:
Four birdss calling,
Three henss french,
Two dovess turtle,
and A partridges in a pear tree.

On the third day of Christmas, my true love sent to me:
Three henss french,
Two dovess turtle,
and A partridges in a pear tree.

On the second day of Christmas, my true love sent to me:
Two dovess turtle,
and A partridges in a pear tree.

On the first day of Christmas, my true love sent to me:
A partridges in a pear tree,
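The first-day verse ends with a comma because the period is only added when `i > 1`. A minimal sketch of a single-verse helper that always closes the last line with a period is shown below. The name `sing_one_day` and its arguments are hypothetical: `phrases` is any list-like of phrases ordered by day, and `ordinals` is a dictionary like `num_word` inside `sing_day()`.

```python
def sing_one_day(phrases, day_number, ordinals):
    """Build the verse for a single day (hypothetical helper)."""
    lines = [
        "On the " + ordinals[day_number]
        + " day of Christmas, my true love sent to me:"
    ]
    # Count down from the current day's gift to the first day's gift
    for j in range(day_number, 0, -1):
        phrase = phrases[j - 1]
        if j == 1:
            # The last line gets "and" only when other gifts precede it
            prefix = "and " if day_number > 1 else ""
            lines.append(prefix + phrase + ".")
        else:
            lines.append(phrase + ",")
    return "\n".join(lines)

# Hypothetical demo data, not taken from the xmas DataFrame
demo_phrases = ["A partridge in a pear tree", "Two turtle doves"]
demo_ordinals = {1: "first", 2: "second"}
print(sing_one_day(demo_phrases, 2, demo_ordinals))
```

A full song could then be assembled by calling the helper once per day and joining the verses with blank lines.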


In [40]:
xmas2 = pd.read_csv("https://www.dropbox.com/scl/fi/p9x9k8xwuzs9rhp582vfy/xmas_2.csv?rlkey=kvc3j3lmyn4opcidsrhcmrof1&dl=1")

In [41]:
xmas2.head()

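|   | Day | Day.in.Words | Gift.Item | Verb | Adjective | Location |
|---|---|---|---|---|---|---|
| 0 | 1 | first | email | | | from Cal Poly |
| 1 | 2 | second | point | | meal | |
| 2 | 3 | third | pen | | lost | |
| 3 | 4 | fourth | review | | course | |
| 4 | 5 | fifth | exam | | practice | |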


- Apply make_phrase() and sing_day() to xmas2

In [42]:
xmas2["Day.in.Words"] = xmas2["Day"].map(num_to_word)

In [43]:
xmas2['Full.Phrase'] = xmas2.apply(
    lambda row: make_phrase(
        row['Day'],
        row['Day.in.Words'],
        row['Gift.Item'],
        row['Verb'],
        row['Adjective'],
        row['Location']
    ),
    axis=1
)

# Print the DataFrame to see the results
print(xmas2[['Full.Phrase']])

                   Full.Phrase
0       An email from Cal Poly
1              Two points meal
2              Three pens lost
3          Four reviews course
4          Five exams practice
5          Six graders grading
6      Seven seniors stressing
7         Eight moms a-calling
8         Nine parties bumping
9         Ten loads of laundry
10  Eleven friends goodbye-ing
11       Twelve hours sleeping


In [44]:
print(sing_day(xmas2, 12, 'Full.Phrase'))

On the twelfth day of Christmas, my true love sent to me:
Twelve hours sleeping,
Eleven friends goodbye-ing,
Ten loads of laundry,
Nine parties bumping,
Eight moms a-calling,
Seven seniors stressing,
Six graders grading,
Five exams practice,
Four reviews course,
Three pens lost,
Two points meal,
and An email from Cal Poly.

On the eleventh day of Christmas, my true love sent to me:
Eleven friends goodbye-ing,
Ten loads of laundry,
Nine parties bumping,
Eight moms a-calling,
Seven seniors stressing,
Six graders grading,
Five exams practice,
Four reviews course,
Three pens lost,
Two points meal,
and An email from Cal Poly.

On the tenth day of Christmas, my true love sent to me:
Ten loads of laundry,
Nine parties bumping,
Eight moms a-calling,
Seven seniors stressing,
Six graders grading,
Five exams practice,
Four reviews course,
Three pens lost,
Two points meal,
and An email from Cal Poly.

On the ninth day of Christmas, my true love sent to me:
Nine parties bumping,
Eight moms a-calli