# Parsing Text (aka Prepping Text Data)

## Exercises
The end result of this exercise should be a file named prepare.py that defines the requested functions.

In this exercise we will be defining some functions to prepare textual data. These functions should apply equally well to both the codeup blog articles and the news articles that were previously acquired.

## Imports

In [1]:
#standard imports
import pandas as pd
import numpy as np

# my imports
import acquire as a

#import Parsing Text
import unicodedata

#import regular expression operations
import re

#import natural language toolkit
import nltk

#import stopwords list
from nltk.corpus import stopwords

In [2]:
news_articles = a.get_news_articles()

In [3]:
blog_articles = a.get_blog_articles()

## 1. Define a function named basic_clean. It should take in a string and apply some basic text cleaning to it:

    • Lowercase everything
    • Normalize unicode characters
    • Replace anything that is not a letter, number, whitespace or a single quote.

In [4]:
# blog_articles[0]['content']

In [5]:
def basic_clean(string):
    """
    Lower Case:
    - setting all letters to a lowercase
    
    Encoding:
    - `unicodedata.normalize` removes any inconsistencies in unicode character encoding
    - `.encode` to convert the resulting string to the ASCII character set
    - `.decode` to turn the resulting bytes object back into a string
    
    Special characters:
    - remove anything that isn't a-z, a number, a single quote, or a whitespace
    """
    # lowercase text
    string = string.lower()
    
    # remove any accented characters and non-ASCII characters
    # normalizing
    # getting ride of anything not in ascii
    # turning back to a string
    string = unicodedata.normalize('NFKD', string).encode('ascii','ignore').decode('utf-8')
    
    # remove special characters
    #use re.sub to remove special characters
    bc_string = re.sub(r'[^a-z0-9\'\s]', '', string)
    
    return bc_string

In [6]:
bc_string = basic_clean(blog_articles[0]['content'])
bc_string

'may is traditionally known as asian american and pacific islander aapi heritage month this month we celebrate the history and contributions made possible by our aapi friends family and community we also examine our level of support and seek opportunities to better understand the aapi community  in an effort to address real concerns and experiences we sat down with arbeena thapa one of codeups financial aid and enrollment managers arbeena identifies as nepali american and desi arbeenas parents immigrated to texas in 1988 for better employment and educational opportunities arbeenas older sister was five when they made the move to the us arbeena was born later becoming the first in her family to be a us citizen at codeup we take our efforts at inclusivity very seriously after speaking with arbeena we were taught that the term aapi excludes desiamerican individuals hence we will now use the term asian pacific islander desi american apida here is how the rest of our conversation with arbee

## 2. Define a function named tokenize. It should take in a string and tokenize all the words in the string.

In [7]:
def tokenize(string):
    """
    Tokenization is the process of breaking something down
    into smaller, discrete units. These units are called tokens.
    
    It's common to tokenize the strings to break up words and punctutation
    left over into discrete units. 
    """  

    #create the tokenizer
    tokenize = nltk.tokenize.ToktokTokenizer()
    tok_string = tokenize.tokenize(string, return_str=True)
  
    return tok_string

In [8]:
tok_string = tokenize(blog_articles[0]['content'])
tok_string

'May is traditionally known as Asian American and Pacific Islander ( AAPI ) Heritage Month. This month we celebrate the history and contributions made possible by our AAPI friends , family , and community. We also examine our level of support and seek opportunities to better understand the AAPI community. In an effort to address real concerns and experiences , we sat down with Arbeena Thapa , one of Codeup ’ s Financial Aid and Enrollment Managers. Arbeena identifies as Nepali American and Desi. Arbeena ’ s parents immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena ’ s older sister was five when they made the move to the US. Arbeena was born later , becoming the first in her family to be a US citizen. At Codeup we take our efforts at inclusivity very seriously. After speaking with Arbeena , we were taught that the term AAPI excludes Desi-American individuals. Hence , we will now use the term Asian Pacific Islander Desi American ( APIDA ) . Here is 

## 3. Define a function named stem. It should accept some text and return the text after applying stemming to all the words.

In [9]:
def stem(string):
    """
    Stemming:
    - **truncates** words to their "stem"
    - algorithmic rules (non lingustic)
    - example: "calls", "called", "calling" --> "call"
    - fast and efficient
    """   
    #create porter stemmer
    ps = nltk.porter.PorterStemmer()
    
    #use stemmer - apply stem to each word in our string
    ps.stem(string)
    
    # split all the words in the article
    string.split()
    stems = [ps.stem (word) for word in string.split()]
    
    #join words back together
    string_stemmed = ' '.join(stems)
    
    return string_stemmed

In [10]:
string_stemmed = stem(blog_articles[0]['content'])
string_stemmed

'may is tradit known as asian american and pacif island (aapi) heritag month. thi month we celebr the histori and contribut made possibl by our aapi friends, family, and community. we also examin our level of support and seek opportun to better understand the aapi community. in an effort to address real concern and experiences, we sat down with arbeena thapa, one of codeup’ financi aid and enrol managers. arbeena identifi as nepali american and desi. arbeena’ parent immigr to texa in 1988 for better employ and educ opportunities. arbeena’ older sister wa five when they made the move to the us. arbeena wa born later, becom the first in her famili to be a us citizen. at codeup we take our effort at inclus veri seriously. after speak with arbeena, we were taught that the term aapi exclud desi-american individuals. hence, we will now use the term asian pacif island desi american (apida). here is how the rest of our convers with arbeena went! how do you celebr or connect with your heritag a

## 4. Define a function named lemmatize. It should accept some text and return the text after applying lemmatization to each word.

In [11]:
def lemmatize(string):
    """
    Lemmatize:
        - **changes** words to their "root"
        - it can conjugate to the base word 
        - example: "mouse", "mice" --> "mouse"
        - slower than stemming
    """ 
    #create the lemmatizer   
    wnl = nltk.stem.WordNetLemmatizer()
    
    #use lemmatize - apply stem to each word in our string
    # wnl.lemmatize(article)
    lemma = [wnl.lemmatize(word) for word in string.split()]
    
    #join words back together
    string_lemma = ' '.join(lemma)
    
    return string_lemma

In [12]:
string_lemma = lemmatize(blog_articles[0]['content'])
string_lemma

'May is traditionally known a Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contribution made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunity to better understand the AAPI community. In an effort to address real concern and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies a Nepali American and Desi. Arbeena’s parent immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister wa five when they made the move to the US. Arbeena wa born later, becoming the first in her family to be a US citizen. At Codeup we take our effort at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversa

## 5. Define a function named remove_stopwords. It should accept some text and return the text after removing all the stopwords.

In [13]:
def remove_stopwords(string, string_lemma):
    """
    Words which have little or no significance, especially when constructing
    meaningful features from text, are known as stopwords.
    - example: a, an, the, and like

    We will use a standard English language stopwords list from nltk
    """
    #save stopwords
    stopwords_ls = stopwords.words('english')
    
    #split words in lemmatized article
    words = string_lemma.split()
    
    #remove stopwords from list of words
    filtered = [word for word in words if word not in stopwords_ls]
    
    #join words back together
    rem_stopwords = ' '.join(filtered)
    
    return rem_stopwords

In [14]:
rem_stopwords = remove_stopwords(blog_articles[0]['content'], string_lemma)
rem_stopwords

'May traditionally known Asian American Pacific Islander (AAPI) Heritage Month. This month celebrate history contribution made possible AAPI friends, family, community. We also examine level support seek opportunity better understand AAPI community. In effort address real concern experiences, sat Arbeena Thapa, one Codeup’s Financial Aid Enrollment Managers. Arbeena identifies Nepali American Desi. Arbeena’s parent immigrated Texas 1988 better employment educational opportunities. Arbeena’s older sister wa five made move US. Arbeena wa born later, becoming first family US citizen. At Codeup take effort inclusivity seriously. After speaking Arbeena, taught term AAPI excludes Desi-American individuals. Hence, use term Asian Pacific Islander Desi American (APIDA). Here rest conversation Arbeena went! How celebrate connect heritage cultural traditions? “I celebrate Nepal’s version Christmas Dashain. This nine-day celebration also known Dussehra. I grew Hindu I identify Hindu, large part he

## This function should define two optional parameters, extra_words and exclude_words. These parameters should define any additional stop words to include, and any words that we don't want to remove.

In [15]:
#set a list to add some stopwords IF THEY ARE NEEDED!
extra_words = ['all', 'about','after']

In [16]:
#set a list to remove some stopwords IF THEY ARE NEEDED!
exclude_words = ['aaa']

In [17]:
def remove_stopwords_extra_words(string_lemma, extra_words, exclude_words):
    """
    Words which have little or no significance, especially when constructing
    meaningful features from text, are known as stopwords.
    - example: a, an, the, and like

    We will use a standard English language stopwords list from nltk
    """
    #save stopwords
    stopwords_ls = stopwords.words('english')
    
    # remove extra words
    stopwords_ls = set(stopwords_ls) - set(exclude_words)

    # add to stopword list
    stopwords_ls = set(stopwords_ls).union(extra_words)
    
    #split words in lemmatized article
    words = string_lemma.split()
    
    #remove stopwords from list of words
    filtered = [word for word in words if word not in stopwords_ls]
    
    #join words back together
    rem_stopwords = ' '.join(filtered)
    
    return rem_stopwords

In [18]:
string_lemma

'May is traditionally known a Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contribution made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunity to better understand the AAPI community. In an effort to address real concern and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies a Nepali American and Desi. Arbeena’s parent immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister wa five when they made the move to the US. Arbeena wa born later, becoming the first in her family to be a US citizen. At Codeup we take our effort at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversa

In [19]:
rem_stopwords = remove_stopwords(string_lemma, extra_words, exclude_words)
rem_stopwords

'May traditionally known Asian American Pacific Islander (AAPI) Heritage Month. This month celebrate history contribution made possible AAPI friends, family, community. We also examine level support seek opportunity better understand AAPI community. In effort address real concern experiences, sat Arbeena Thapa, one Codeup’s Financial Aid Enrollment Managers. Arbeena identifies Nepali American Desi. Arbeena’s parent immigrated Texas 1988 better employment educational opportunities. Arbeena’s older sister wa five made move US. Arbeena wa born later, becoming first family US citizen. At Codeup take effort inclusivity seriously. After speaking Arbeena, taught term AAPI excludes Desi-American individuals. Hence, use term Asian Pacific Islander Desi American (APIDA). Here rest conversation Arbeena went! How celebrate connect heritage cultural traditions? “I celebrate Nepal’s version Christmas Dashain. This nine-day celebration also known Dussehra. I grew Hindu I identify Hindu, large part he

## 6. Use your data from the acquire to produce a dataframe of the news articles. Name the dataframe news_df.

In [20]:
news_articles = a.get_news_articles()
news_df = pd.DataFrame(news_articles)
news_df

Unnamed: 0,category,title,content
0,business,"Sensex, Nifty end at fresh closing highs","Benchmark indices Sensex and Nifty ended at record closing highs on Wednesday. Sensex ended 195 points higher at 63,523 while the Nifty ended at 18,856.85, up 40 points. The gains were led by stocks like HDFC, Reliance Industries and TCS. During the intraday trade, Sensex rose to its fresh record high level of 63,588."
1,business,Amazon tricked millions of customers into enrolling in Prime: US FTC,"US Federal Trade Commission (FTC) has sued Amazon, accusing it of tricking millions of consumers into signing up for its Prime subscription without their consent. ""Amazon used manipulative, coercive or deceptive user-interface designs known as 'dark patterns' to trick consumers into enrolling in automatically-renewing Prime subscriptions,"" US FTC said. Prime members in the US pay $139 per year."
2,business,TIME releases list of the world's 100 most influential companies,"TIME magazine has released its annual list of the world's 100 most influential companies, which features OpenAI, SpaceX, Chess.com, Google DeepMind and Kim Kardashian's SKIMS among others. The National Payments Corporation of India (NPCI) and e-commerce platform Meesho also featured on the list. ""NPCI launched UPI...which accounted for 52% of India's digital transactions in FY22,"" TIME said."
3,business,Which are the world's top 10 airlines according to passengers?,"Singapore Airlines is the world's best airline, according to Skytrax World Airline Awards 2023, an annual poll of flyers released at the Paris Air Show. It is followed by Qatar Airways, All Nippon Airways, Emirates, Japan Airlines, Turkish Airlines, Air France, Cathay Pacific, EVA Air, and Korean Air. Vistara, ranked 16th, is the only Indian airline in the top 20."
4,business,"Grab lays off over 1,000 employees","Singapore-based ride-hailing and food delivery app Grab has laid off over 1,000 employees. This is Grab's largest round of layoffs since 2020, when it cut 360 jobs in response to COVID-19 pandemic challenges. ""I want to be clear that we're not doing this as a shortcut to profitability,"" Group CEO and Co-Founder Anthony Tan said in an e-mail to employees."
...,...,...,...
95,entertainment,"Asked Alia 'What is it that H'wood has', she said money: Mahesh","Filmmaker Mahesh Bhatt said he once asked Alia Bhatt that what does Hollywood has that Indian cinema doesn't. ""Her straight reply was 'Money'. She said it with great humility. She also said that they've a way of doing things...they're very professional,"" he recalled. Commenting on Alia's Hollywood debut with 'Heart of Stone', Mahesh said, ""My heart soars with pride."""
96,entertainment,"Two of my most favourite people: Kangana on Elon Musk, PM Modi","Actress Kangana Ranaut reacted to Elon Musk's latest statement that he's a fan of Prime Minister Narendra Modi. Sharing a picture of Elon Musk and PM Modi on Instagram, Kangana wrote, ""Two of my most favourite people. Such a lovely morning."" On the work front, Kangana is busy in promotions of 'Tiku Weds Sheru', which marks her maiden production venture."
97,entertainment,"Thought kids at school would laugh at me, said no: Jugal on Masoom","Jugal Hansraj said that he had initially rejected Shekhar Kapur's 'Masoom'. ""When I heard the story, I thought this kid cries a lot...Everyone in my school would laugh at me...call me a crybaby...I said no,"" Hansraj shared. He added that the filmmaker insisted that he wanted Jugal in the film. ""I actually had...lot of fun,"" Hansraj said about shooting 'Masoom'."
98,entertainment,"Richard Gere attends Yoga event led by PM Modi at UN HQ, video out","Actor Richard Gere was in attendance at the Yoga event led by PM Narendra Modi on the occasion of International Yoga Day on June 21 at the United Nations Headquarters in New York. ""It is a very nice feeling here today, so open and embracing,"" Gere told reporters. Gere was among people of 135 nationalities who attended the event."


## 7. Make another dataframe for the Codeup blog posts. Name the dataframe codeup_df.

In [21]:
blog_articles = a.get_blog_articles()
codeup_df = pd.DataFrame(blog_articles)
codeup_df

Unnamed: 0,title,content
0,Spotlight on APIDA Voices: Celebrating Heritage and Inspiring Change ft. Arbeena Thapa,"May is traditionally known as Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contributions made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunities to better understand the AAPI community. In an effort to address real concerns and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies as Nepali American and Desi. Arbeena’s parents immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister was five when they made the move to the US. Arbeena was born later, becoming the first in her family to be a US citizen. At Codeup we take our efforts at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversation with Arbeena went! How do you celebrate or connect with your heritage and cultural traditions? “I celebrate Nepal’s version of Christmas or Dashain. This is a nine-day celebration also known as Dussehra. I grew up as Hindu and I identify as Hindu, this is a very large part of my heritage. “ “Other ways I connect with my culture include sharing food! Momos are South Asian Dumplings and they’re my favorite to make and share.” “On my Asian American side, I am an advocate of immigrant justice and erasure within APIDA social or political movements. I participate in events to embrace my identity such as immigrant justice advocacy because I come from a mixed-status family. I’ve always been in a community with undocumented Asian immigrants. .” What are some of the challenges you have faced as an APIDA individual, personally or professionally? “I often struggle with being gendered as compliant or a pushover. Professionally, I am often stereotyped as meek, so I’ve been overlooked for leadership roles. We are seen as perpetually foreign; people tend to other us in that way, yet put us on a pedestal for what a model minority looks like. This has made me hesitant to share my heritage in the past because these assumptions get mapped onto me. ” Can you describe some common barriers of entry that APIDA individuals, specifically women may face when trying to enter or advance in the workplace? “Being overlooked for leadership. In the past, I have not been viewed as a leader. People sometimes have preconceived stereotypes of Asian women not being able to be bold, or being vocal can be mistaken for being too emotional. “ How do you believe microaggressions impact APIDA individuals in the workplace? Can you provide examples of such microaggressions? “Erasure is big. To me, only saying ‘Merry Christmas’ isn’t inclusive to other religions. People are often resistant to saying ‘Happy Holidays,’ but saying Merry Christmas excludes, and does not appreciate my heritage. “ “Often microaggressions are not micro at all. They typically are not aggressive racialized violence, but the term ‘micro’ minimizes impact.” “Some that I’ve heard are ‘What kind of Asian are you?’ or ‘Where are you from?’ This automatically makes me the ‘other’ and not seen as American. Even within the APIDA community, South Asians are overlooked as “Asian”.” How important is representation, specifically APIDA representation, in organizational leadership positions? “I want to say that it is important to have someone who looks like you in leadership roles, and it is, but those leaders may not share the same beliefs as you. Certain privileges such as wealth, resources, or lack of interaction with lower-socioeconomic-status Asian Americans may cause a difference in community politics. I do not think the bamboo ceiling is acceptable, but the company you work for plays a big part in your politics and belief alignment.” How do you feel about code-switching, and have you ever felt it necessary to code-switch? “I like sharing South Asian terms or connecting with others that have similar heritage and culture. A workplace that is welcoming to going into this sort of breakout is refreshing and makes space for us. However, having to code-switch could also mean a workplace that is not conducive and welcoming of other cultures. “ Finally, in your opinion, what long-term strategies can create lasting change in the workplace and ensure support, equality, and inclusion for APIDA individuals? “Prior to a career in financial aid, I did a lot of research related to the post-9/11 immigration of the South Asian diaspora. This background made me heavily rely on grassroots organizing. Hire the people that want to innovate, hire the changemakers, hire the button-pushers. Reduce reliance on whiteness as change. This will become natural for the organization and become organizational change. Change comes from us on the ground.” A huge thank you to Arbeena Thapa for sharing her experiences, and being vulnerable with us. Your words were inspiring and the opportunity to understand your perspective more has been valuable. We hope we can become better support for the APIDA community as we learn and grow on our journey of cultivating inclusive growth."
1,Women in tech: Panelist Spotlight – Magdalena Rahn,"Women in tech: Panelist Spotlight – Magdalena Rahn Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Magdalena! Magdalena Rahn is a current Codeup student in a Data Science cohort in San Antonio, Texas. She has a professional background in cross-cultural communications, international business development, the wine industry and journalism. After serving in the US Navy, she decided to complement her professional skill set by attending the Data Science program at Codeup; she is set to graduate in March 2023. Magdalena is fluent in French, Bulgarian, Chinese-Mandarin, Spanish and Italian. We asked Magdalena how Codeup impacted her career, and she replied “Codeup has provided a solid foundation in analytical processes, programming and data science methods, and it’s been an encouragement to have such supportive instructors and wonderful classmates.” Don’t forget to tune in on March 29th to sit in on an insightful conversation with Magdalena."
2,Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill,"Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Rachel! Rachel Robbins-Mayhill is a Decision Science Analyst I in San Antonio, Texas. Rachel has had a varied career that includes counseling, teaching, training, community development, and military operations. Her focus has always been on assessing needs, identifying solutions, and educating individuals and groups on aligning needs and solutions in different contexts. Rachel’s passion for data science stems from her belief that data is a powerful tool for communicating patterns that can lead to hope and growth in the future. In June 2022, Rachel graduated from Codeup’s Innis cohort, where she honed her skills in data science. Shortly after, she started working as a Data Science Technical Writer with Apex Systems as a Contractor for USAA in July 2022. Her unconventional role allowed her to understand where her skills could be best utilized to support USAA in a non-contract role. Rachel recently joined USAA’s Data Science Delivery team as a Decision Science Analyst I in February 2023. The team is focused on delivering machine learning models for fraud prevention, and Rachel’s particular role centers around providing strategic process solutions for the team in collaboration with Operational and Model Risk components. In addition to her career, Rachel is currently pursuing a master’s degree in Applied Data Science from Syracuse University, further expanding her knowledge and skills in the field. Rachel is passionate about collaborating with individuals who share her belief in the potential of others and strive to achieve growth through logical, informed action. She welcomes LinkedIn connections and is excited about supporting the network of CodeUp alumni! We asked Rachel how Codeup impacted her career, and she replied “Codeup delivered a comprehensive education in all facets of the data science pipeline, laying a strong foundation for me to build upon. Through repeated hands-on practice, I developed a reliable process that was immediately applicable in my job. Collaborative group projects were instrumental in helping me hone my skills in project management, allowing me to navigate complex data science projects with comfortability. Thanks to this invaluable experience, I was able to make significant strides in my career within just six months of graduating from Codeup.” Don’t forget to tune in on March 29th to sit in on an insightful conversation."
3,Women in Tech: Panelist Spotlight – Sarah Mellor,"Women in tech: Panelist Spotlight – Sarah Mellor Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Sarah! Sarah Mellor currently works as the Director of People Operations. She joined Codeup four and a half years ago as an Admissions Manager. She went on to build out and lead the Marketing and Admissions team, while picking up People Ops tasks and projects here and there until moving over to lead the People Ops team two years ago. Prior to Codeup, she worked at education-focused non-profits in Washington, DC and Boulder, Colorado. She graduated from Wake Forest University. We asked Sarah how Codeup has impacted her career, and her response was “I have absolutely loved having the privilege to grow alongside Codeup. In my time here across multiple different roles and departments, I’ve seen a lot of change. The consistent things have always been the high quality of passionate and hardworking people I get to work with; the impactful mission we get to work on; and the inspiring students who trust us with their career change.” Don’t forget to tune in on March 29th to sit in on an insightful conversation."
4,Women in Tech: Panelist Spotlight – Madeleine Capper,"Women in tech: Panelist Spotlight – Madeleine Capper Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Madeleine! Madeleine Capper is a Data Scientist in San Antonio, Texas. A long-standing San Antonio resident, she studied mathematics at the University of Texas San Antonio and has worked as a Data Scientist for Booz Allen Hamilton. Madeleine currently teaches Data Science at Codeup, where she works daily with burgeoning data professionals to help them actualize their career aspirations through technical education. Madeleine attended Codeup as a student in early 2019 as a pupil in the very first Codeup Data Science cohort. The program proved immediately effective and she was the first student to obtain a data career out of the program. After working at Booz Allen Hamilton, Madeleine’s passion for education in conjunction with her appreciation for Codeup’s capacity for transformative life change brought her back to the institution in an instructional capacity, where she has been teaching for two years. Don’t forget to tune in on March 29th to sit in on an insightful conversation."
5,Black Excellence in Tech: Panelist Spotlight – Wilmarie De La Cruz Mejia,"Black excellence in tech: Panelist Spotlight – Wilmarie De La Cruz Mejia Codeup is hosting a Black Excellence in Tech Panel in honor of Black History Month on February 22, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as black leaders in the tech industry! Meet Wilmarie! Wilmarie De La Cruz Mejia is a current Codeup student on the path to becoming a Full-Stack Web Developer at our Dallas, TX campus. Wilmarie is a veteran expanding her knowledge of programming languages and technologies on her journey with Codeup. We asked Wilmarie to share more about her experience at Codeup. She shares, “I was able to meet other people who were passionate about coding and be in a positive learning environment.” We hope you can join us on February 22nd to sit in on an insightful conversation with Wilmarie and all of our panelists!"


## 8. For each dataframe, produce the following columns:

. **title** to hold the title     
. **original** to hold the original article/post content    
. **clean** to hold the normalized and tokenized original with the stopwords removed.    
. **stemmed** to hold the stemmed version of the cleaned data.   
. **lemmatized** to hold the lemmatized version of the cleaned data.

## news_df DATAFRAME

In [22]:
# df.rename columns from content to original
news_df = news_df.rename(columns={'content':'original'})
news_df

Unnamed: 0,category,title,original
0,business,"Sensex, Nifty end at fresh closing highs","Benchmark indices Sensex and Nifty ended at record closing highs on Wednesday. Sensex ended 195 points higher at 63,523 while the Nifty ended at 18,856.85, up 40 points. The gains were led by stocks like HDFC, Reliance Industries and TCS. During the intraday trade, Sensex rose to its fresh record high level of 63,588."
1,business,Amazon tricked millions of customers into enrolling in Prime: US FTC,"US Federal Trade Commission (FTC) has sued Amazon, accusing it of tricking millions of consumers into signing up for its Prime subscription without their consent. ""Amazon used manipulative, coercive or deceptive user-interface designs known as 'dark patterns' to trick consumers into enrolling in automatically-renewing Prime subscriptions,"" US FTC said. Prime members in the US pay $139 per year."
2,business,TIME releases list of the world's 100 most influential companies,"TIME magazine has released its annual list of the world's 100 most influential companies, which features OpenAI, SpaceX, Chess.com, Google DeepMind and Kim Kardashian's SKIMS among others. The National Payments Corporation of India (NPCI) and e-commerce platform Meesho also featured on the list. ""NPCI launched UPI...which accounted for 52% of India's digital transactions in FY22,"" TIME said."
3,business,Which are the world's top 10 airlines according to passengers?,"Singapore Airlines is the world's best airline, according to Skytrax World Airline Awards 2023, an annual poll of flyers released at the Paris Air Show. It is followed by Qatar Airways, All Nippon Airways, Emirates, Japan Airlines, Turkish Airlines, Air France, Cathay Pacific, EVA Air, and Korean Air. Vistara, ranked 16th, is the only Indian airline in the top 20."
4,business,"Grab lays off over 1,000 employees","Singapore-based ride-hailing and food delivery app Grab has laid off over 1,000 employees. This is Grab's largest round of layoffs since 2020, when it cut 360 jobs in response to COVID-19 pandemic challenges. ""I want to be clear that we're not doing this as a shortcut to profitability,"" Group CEO and Co-Founder Anthony Tan said in an e-mail to employees."
...,...,...,...
95,entertainment,"Asked Alia 'What is it that H'wood has', she said money: Mahesh","Filmmaker Mahesh Bhatt said he once asked Alia Bhatt that what does Hollywood has that Indian cinema doesn't. ""Her straight reply was 'Money'. She said it with great humility. She also said that they've a way of doing things...they're very professional,"" he recalled. Commenting on Alia's Hollywood debut with 'Heart of Stone', Mahesh said, ""My heart soars with pride."""
96,entertainment,"Two of my most favourite people: Kangana on Elon Musk, PM Modi","Actress Kangana Ranaut reacted to Elon Musk's latest statement that he's a fan of Prime Minister Narendra Modi. Sharing a picture of Elon Musk and PM Modi on Instagram, Kangana wrote, ""Two of my most favourite people. Such a lovely morning."" On the work front, Kangana is busy in promotions of 'Tiku Weds Sheru', which marks her maiden production venture."
97,entertainment,"Thought kids at school would laugh at me, said no: Jugal on Masoom","Jugal Hansraj said that he had initially rejected Shekhar Kapur's 'Masoom'. ""When I heard the story, I thought this kid cries a lot...Everyone in my school would laugh at me...call me a crybaby...I said no,"" Hansraj shared. He added that the filmmaker insisted that he wanted Jugal in the film. ""I actually had...lot of fun,"" Hansraj said about shooting 'Masoom'."
98,entertainment,"Richard Gere attends Yoga event led by PM Modi at UN HQ, video out","Actor Richard Gere was in attendance at the Yoga event led by PM Narendra Modi on the occasion of International Yoga Day on June 21 at the United Nations Headquarters in New York. ""It is a very nice feeling here today, so open and embracing,"" Gere told reporters. Gere was among people of 135 nationalities who attended the event."


In [23]:
# combine basic and tokenize functions
basic_clean(tokenize(remove_stopwords(news_df.original[0], extra_words, exclude_words)))

'benchmark indices sensex nifty ended record closing highs wednesday sensex ended 195 points higher 63523 nifty ended 1885685  40 points the gains led stocks like hdfc  reliance industries tcs during intraday trade  sensex rose fresh record high level 63588 '

In [24]:
news_df['clean'] = news_df.original.apply(basic_clean).apply(tokenize).apply(remove_stopwords, extra_words=extra_words, exclude_words=exclude_words)
news_df.head()

Unnamed: 0,category,title,original,clean
0,business,"Sensex, Nifty end at fresh closing highs","Benchmark indices Sensex and Nifty ended at record closing highs on Wednesday. Sensex ended 195 points higher at 63,523 while the Nifty ended at 18,856.85, up 40 points. The gains were led by stocks like HDFC, Reliance Industries and TCS. During the intraday trade, Sensex rose to its fresh record high level of 63,588.",benchmark indices sensex nifty ended record closing highs wednesday sensex ended 195 points higher 63523 nifty ended 1885685 40 points gains led stocks like hdfc reliance industries tcs intraday trade sensex rose fresh record high level 63588
1,business,Amazon tricked millions of customers into enrolling in Prime: US FTC,"US Federal Trade Commission (FTC) has sued Amazon, accusing it of tricking millions of consumers into signing up for its Prime subscription without their consent. ""Amazon used manipulative, coercive or deceptive user-interface designs known as 'dark patterns' to trick consumers into enrolling in automatically-renewing Prime subscriptions,"" US FTC said. Prime members in the US pay $139 per year.",us federal trade commission ftc sued amazon accusing tricking millions consumers signing prime subscription without consent amazon used manipulative coercive deceptive userinterface designs known ' dark patterns ' trick consumers enrolling automaticallyrenewing prime subscriptions us ftc said prime members us pay 139 per year
2,business,TIME releases list of the world's 100 most influential companies,"TIME magazine has released its annual list of the world's 100 most influential companies, which features OpenAI, SpaceX, Chess.com, Google DeepMind and Kim Kardashian's SKIMS among others. The National Payments Corporation of India (NPCI) and e-commerce platform Meesho also featured on the list. ""NPCI launched UPI...which accounted for 52% of India's digital transactions in FY22,"" TIME said.",time magazine released annual list world ' 100 influential companies features openai spacex chesscom google deepmind kim kardashian ' skims among others national payments corporation india npci ecommerce platform meesho also featured list npci launched upiwhich accounted 52 india ' digital transactions fy22 time said
3,business,Which are the world's top 10 airlines according to passengers?,"Singapore Airlines is the world's best airline, according to Skytrax World Airline Awards 2023, an annual poll of flyers released at the Paris Air Show. It is followed by Qatar Airways, All Nippon Airways, Emirates, Japan Airlines, Turkish Airlines, Air France, Cathay Pacific, EVA Air, and Korean Air. Vistara, ranked 16th, is the only Indian airline in the top 20.",singapore airlines world ' best airline according skytrax world airline awards 2023 annual poll flyers released paris air show followed qatar airways nippon airways emirates japan airlines turkish airlines air france cathay pacific eva air korean air vistara ranked 16th indian airline top 20
4,business,"Grab lays off over 1,000 employees","Singapore-based ride-hailing and food delivery app Grab has laid off over 1,000 employees. This is Grab's largest round of layoffs since 2020, when it cut 360 jobs in response to COVID-19 pandemic challenges. ""I want to be clear that we're not doing this as a shortcut to profitability,"" Group CEO and Co-Founder Anthony Tan said in an e-mail to employees.",singaporebased ridehailing food delivery app grab laid 1000 employees grab ' largest round layoffs since 2020 cut 360 jobs response covid19 pandemic challenges want clear ' shortcut profitability group ceo cofounder anthony tan said email employees


In [25]:
stem(news_df.clean[0])

'benchmark indic sensex nifti end record close high wednesday sensex end 195 point higher 63523 nifti end 1885685 40 point gain led stock like hdfc relianc industri tc intraday trade sensex rose fresh record high level 63588'

In [26]:
news_df['stemmed'] = news_df.clean.apply(stem)
news_df

Unnamed: 0,category,title,original,clean,stemmed
0,business,"Sensex, Nifty end at fresh closing highs","Benchmark indices Sensex and Nifty ended at record closing highs on Wednesday. Sensex ended 195 points higher at 63,523 while the Nifty ended at 18,856.85, up 40 points. The gains were led by stocks like HDFC, Reliance Industries and TCS. During the intraday trade, Sensex rose to its fresh record high level of 63,588.",benchmark indices sensex nifty ended record closing highs wednesday sensex ended 195 points higher 63523 nifty ended 1885685 40 points gains led stocks like hdfc reliance industries tcs intraday trade sensex rose fresh record high level 63588,benchmark indic sensex nifti end record close high wednesday sensex end 195 point higher 63523 nifti end 1885685 40 point gain led stock like hdfc relianc industri tc intraday trade sensex rose fresh record high level 63588
1,business,Amazon tricked millions of customers into enrolling in Prime: US FTC,"US Federal Trade Commission (FTC) has sued Amazon, accusing it of tricking millions of consumers into signing up for its Prime subscription without their consent. ""Amazon used manipulative, coercive or deceptive user-interface designs known as 'dark patterns' to trick consumers into enrolling in automatically-renewing Prime subscriptions,"" US FTC said. Prime members in the US pay $139 per year.",us federal trade commission ftc sued amazon accusing tricking millions consumers signing prime subscription without consent amazon used manipulative coercive deceptive userinterface designs known ' dark patterns ' trick consumers enrolling automaticallyrenewing prime subscriptions us ftc said prime members us pay 139 per year,us feder trade commiss ftc su amazon accus trick million consum sign prime subscript without consent amazon use manipul coerciv decept userinterfac design known ' dark pattern ' trick consum enrol automaticallyrenew prime subscript us ftc said prime member us pay 139 per year
2,business,TIME releases list of the world's 100 most influential companies,"TIME magazine has released its annual list of the world's 100 most influential companies, which features OpenAI, SpaceX, Chess.com, Google DeepMind and Kim Kardashian's SKIMS among others. The National Payments Corporation of India (NPCI) and e-commerce platform Meesho also featured on the list. ""NPCI launched UPI...which accounted for 52% of India's digital transactions in FY22,"" TIME said.",time magazine released annual list world ' 100 influential companies features openai spacex chesscom google deepmind kim kardashian ' skims among others national payments corporation india npci ecommerce platform meesho also featured list npci launched upiwhich accounted 52 india ' digital transactions fy22 time said,time magazin releas annual list world ' 100 influenti compani featur openai spacex chesscom googl deepmind kim kardashian ' skim among other nation payment corpor india npci ecommerc platform meesho also featur list npci launch upiwhich account 52 india ' digit transact fy22 time said
3,business,Which are the world's top 10 airlines according to passengers?,"Singapore Airlines is the world's best airline, according to Skytrax World Airline Awards 2023, an annual poll of flyers released at the Paris Air Show. It is followed by Qatar Airways, All Nippon Airways, Emirates, Japan Airlines, Turkish Airlines, Air France, Cathay Pacific, EVA Air, and Korean Air. Vistara, ranked 16th, is the only Indian airline in the top 20.",singapore airlines world ' best airline according skytrax world airline awards 2023 annual poll flyers released paris air show followed qatar airways nippon airways emirates japan airlines turkish airlines air france cathay pacific eva air korean air vistara ranked 16th indian airline top 20,singapor airlin world ' best airlin accord skytrax world airlin award 2023 annual poll flyer releas pari air show follow qatar airway nippon airway emir japan airlin turkish airlin air franc cathay pacif eva air korean air vistara rank 16th indian airlin top 20
4,business,"Grab lays off over 1,000 employees","Singapore-based ride-hailing and food delivery app Grab has laid off over 1,000 employees. This is Grab's largest round of layoffs since 2020, when it cut 360 jobs in response to COVID-19 pandemic challenges. ""I want to be clear that we're not doing this as a shortcut to profitability,"" Group CEO and Co-Founder Anthony Tan said in an e-mail to employees.",singaporebased ridehailing food delivery app grab laid 1000 employees grab ' largest round layoffs since 2020 cut 360 jobs response covid19 pandemic challenges want clear ' shortcut profitability group ceo cofounder anthony tan said email employees,singaporebas ridehail food deliveri app grab laid 1000 employe grab ' largest round layoff sinc 2020 cut 360 job respons covid19 pandem challeng want clear ' shortcut profit group ceo cofound anthoni tan said email employe
...,...,...,...,...,...
95,entertainment,"Asked Alia 'What is it that H'wood has', she said money: Mahesh","Filmmaker Mahesh Bhatt said he once asked Alia Bhatt that what does Hollywood has that Indian cinema doesn't. ""Her straight reply was 'Money'. She said it with great humility. She also said that they've a way of doing things...they're very professional,"" he recalled. Commenting on Alia's Hollywood debut with 'Heart of Stone', Mahesh said, ""My heart soars with pride.""",filmmaker mahesh bhatt said asked alia bhatt hollywood indian cinema ' straight reply ' money ' said great humility also said ' way thingsthey ' professional recalled commenting alia ' hollywood debut ' heart stone ' mahesh said heart soars pride,filmmak mahesh bhatt said ask alia bhatt hollywood indian cinema ' straight repli ' money ' said great humil also said ' way thingsthey ' profession recal comment alia ' hollywood debut ' heart stone ' mahesh said heart soar pride
96,entertainment,"Two of my most favourite people: Kangana on Elon Musk, PM Modi","Actress Kangana Ranaut reacted to Elon Musk's latest statement that he's a fan of Prime Minister Narendra Modi. Sharing a picture of Elon Musk and PM Modi on Instagram, Kangana wrote, ""Two of my most favourite people. Such a lovely morning."" On the work front, Kangana is busy in promotions of 'Tiku Weds Sheru', which marks her maiden production venture.",actress kangana ranaut reacted elon musk ' latest statement ' fan prime minister narendra modi sharing picture elon musk pm modi instagram kangana wrote two favourite people lovely morning work front kangana busy promotions ' tiku weds sheru ' marks maiden production venture,actress kangana ranaut react elon musk ' latest statement ' fan prime minist narendra modi share pictur elon musk pm modi instagram kangana wrote two favourit peopl love morn work front kangana busi promot ' tiku wed sheru ' mark maiden product ventur
97,entertainment,"Thought kids at school would laugh at me, said no: Jugal on Masoom","Jugal Hansraj said that he had initially rejected Shekhar Kapur's 'Masoom'. ""When I heard the story, I thought this kid cries a lot...Everyone in my school would laugh at me...call me a crybaby...I said no,"" Hansraj shared. He added that the filmmaker insisted that he wanted Jugal in the film. ""I actually had...lot of fun,"" Hansraj said about shooting 'Masoom'.",jugal hansraj said initially rejected shekhar kapur ' ' masoom ' heard story thought kid cries loteveryone school would laugh mecall crybabyi said hansraj shared added filmmaker insisted wanted jugal film actually hadlot fun hansraj said shooting ' masoom ',jugal hansraj said initi reject shekhar kapur ' ' masoom ' heard stori thought kid cri loteveryon school would laugh mecal crybabyi said hansraj share ad filmmak insist want jugal film actual hadlot fun hansraj said shoot ' masoom '
98,entertainment,"Richard Gere attends Yoga event led by PM Modi at UN HQ, video out","Actor Richard Gere was in attendance at the Yoga event led by PM Narendra Modi on the occasion of International Yoga Day on June 21 at the United Nations Headquarters in New York. ""It is a very nice feeling here today, so open and embracing,"" Gere told reporters. Gere was among people of 135 nationalities who attended the event.",actor richard gere attendance yoga event led pm narendra modi occasion international yoga day june 21 united nations headquarters new york nice feeling today open embracing gere told reporters gere among people 135 nationalities attended event,actor richard gere attend yoga event led pm narendra modi occas intern yoga day june 21 unit nation headquart new york nice feel today open embrac gere told report gere among peopl 135 nation attend event


In [27]:
lemmatize(news_df.clean[0])

'benchmark index sensex nifty ended record closing high wednesday sensex ended 195 point higher 63523 nifty ended 1885685 40 point gain led stock like hdfc reliance industry tc intraday trade sensex rose fresh record high level 63588'

In [28]:
news_df['lemmatized'] = news_df.clean.apply(lemmatize)
news_df

Unnamed: 0,category,title,original,clean,stemmed,lemmatized
0,business,"Sensex, Nifty end at fresh closing highs","Benchmark indices Sensex and Nifty ended at record closing highs on Wednesday. Sensex ended 195 points higher at 63,523 while the Nifty ended at 18,856.85, up 40 points. The gains were led by stocks like HDFC, Reliance Industries and TCS. During the intraday trade, Sensex rose to its fresh record high level of 63,588.",benchmark indices sensex nifty ended record closing highs wednesday sensex ended 195 points higher 63523 nifty ended 1885685 40 points gains led stocks like hdfc reliance industries tcs intraday trade sensex rose fresh record high level 63588,benchmark indic sensex nifti end record close high wednesday sensex end 195 point higher 63523 nifti end 1885685 40 point gain led stock like hdfc relianc industri tc intraday trade sensex rose fresh record high level 63588,benchmark index sensex nifty ended record closing high wednesday sensex ended 195 point higher 63523 nifty ended 1885685 40 point gain led stock like hdfc reliance industry tc intraday trade sensex rose fresh record high level 63588
1,business,Amazon tricked millions of customers into enrolling in Prime: US FTC,"US Federal Trade Commission (FTC) has sued Amazon, accusing it of tricking millions of consumers into signing up for its Prime subscription without their consent. ""Amazon used manipulative, coercive or deceptive user-interface designs known as 'dark patterns' to trick consumers into enrolling in automatically-renewing Prime subscriptions,"" US FTC said. Prime members in the US pay $139 per year.",us federal trade commission ftc sued amazon accusing tricking millions consumers signing prime subscription without consent amazon used manipulative coercive deceptive userinterface designs known ' dark patterns ' trick consumers enrolling automaticallyrenewing prime subscriptions us ftc said prime members us pay 139 per year,us feder trade commiss ftc su amazon accus trick million consum sign prime subscript without consent amazon use manipul coerciv decept userinterfac design known ' dark pattern ' trick consum enrol automaticallyrenew prime subscript us ftc said prime member us pay 139 per year,u federal trade commission ftc sued amazon accusing tricking million consumer signing prime subscription without consent amazon used manipulative coercive deceptive userinterface design known ' dark pattern ' trick consumer enrolling automaticallyrenewing prime subscription u ftc said prime member u pay 139 per year
2,business,TIME releases list of the world's 100 most influential companies,"TIME magazine has released its annual list of the world's 100 most influential companies, which features OpenAI, SpaceX, Chess.com, Google DeepMind and Kim Kardashian's SKIMS among others. The National Payments Corporation of India (NPCI) and e-commerce platform Meesho also featured on the list. ""NPCI launched UPI...which accounted for 52% of India's digital transactions in FY22,"" TIME said.",time magazine released annual list world ' 100 influential companies features openai spacex chesscom google deepmind kim kardashian ' skims among others national payments corporation india npci ecommerce platform meesho also featured list npci launched upiwhich accounted 52 india ' digital transactions fy22 time said,time magazin releas annual list world ' 100 influenti compani featur openai spacex chesscom googl deepmind kim kardashian ' skim among other nation payment corpor india npci ecommerc platform meesho also featur list npci launch upiwhich account 52 india ' digit transact fy22 time said,time magazine released annual list world ' 100 influential company feature openai spacex chesscom google deepmind kim kardashian ' skim among others national payment corporation india npci ecommerce platform meesho also featured list npci launched upiwhich accounted 52 india ' digital transaction fy22 time said
3,business,Which are the world's top 10 airlines according to passengers?,"Singapore Airlines is the world's best airline, according to Skytrax World Airline Awards 2023, an annual poll of flyers released at the Paris Air Show. It is followed by Qatar Airways, All Nippon Airways, Emirates, Japan Airlines, Turkish Airlines, Air France, Cathay Pacific, EVA Air, and Korean Air. Vistara, ranked 16th, is the only Indian airline in the top 20.",singapore airlines world ' best airline according skytrax world airline awards 2023 annual poll flyers released paris air show followed qatar airways nippon airways emirates japan airlines turkish airlines air france cathay pacific eva air korean air vistara ranked 16th indian airline top 20,singapor airlin world ' best airlin accord skytrax world airlin award 2023 annual poll flyer releas pari air show follow qatar airway nippon airway emir japan airlin turkish airlin air franc cathay pacif eva air korean air vistara rank 16th indian airlin top 20,singapore airline world ' best airline according skytrax world airline award 2023 annual poll flyer released paris air show followed qatar airway nippon airway emirate japan airline turkish airline air france cathay pacific eva air korean air vistara ranked 16th indian airline top 20
4,business,"Grab lays off over 1,000 employees","Singapore-based ride-hailing and food delivery app Grab has laid off over 1,000 employees. This is Grab's largest round of layoffs since 2020, when it cut 360 jobs in response to COVID-19 pandemic challenges. ""I want to be clear that we're not doing this as a shortcut to profitability,"" Group CEO and Co-Founder Anthony Tan said in an e-mail to employees.",singaporebased ridehailing food delivery app grab laid 1000 employees grab ' largest round layoffs since 2020 cut 360 jobs response covid19 pandemic challenges want clear ' shortcut profitability group ceo cofounder anthony tan said email employees,singaporebas ridehail food deliveri app grab laid 1000 employe grab ' largest round layoff sinc 2020 cut 360 job respons covid19 pandem challeng want clear ' shortcut profit group ceo cofound anthoni tan said email employe,singaporebased ridehailing food delivery app grab laid 1000 employee grab ' largest round layoff since 2020 cut 360 job response covid19 pandemic challenge want clear ' shortcut profitability group ceo cofounder anthony tan said email employee
...,...,...,...,...,...,...
95,entertainment,"Asked Alia 'What is it that H'wood has', she said money: Mahesh","Filmmaker Mahesh Bhatt said he once asked Alia Bhatt that what does Hollywood has that Indian cinema doesn't. ""Her straight reply was 'Money'. She said it with great humility. She also said that they've a way of doing things...they're very professional,"" he recalled. Commenting on Alia's Hollywood debut with 'Heart of Stone', Mahesh said, ""My heart soars with pride.""",filmmaker mahesh bhatt said asked alia bhatt hollywood indian cinema ' straight reply ' money ' said great humility also said ' way thingsthey ' professional recalled commenting alia ' hollywood debut ' heart stone ' mahesh said heart soars pride,filmmak mahesh bhatt said ask alia bhatt hollywood indian cinema ' straight repli ' money ' said great humil also said ' way thingsthey ' profession recal comment alia ' hollywood debut ' heart stone ' mahesh said heart soar pride,filmmaker mahesh bhatt said asked alia bhatt hollywood indian cinema ' straight reply ' money ' said great humility also said ' way thingsthey ' professional recalled commenting alia ' hollywood debut ' heart stone ' mahesh said heart soar pride
96,entertainment,"Two of my most favourite people: Kangana on Elon Musk, PM Modi","Actress Kangana Ranaut reacted to Elon Musk's latest statement that he's a fan of Prime Minister Narendra Modi. Sharing a picture of Elon Musk and PM Modi on Instagram, Kangana wrote, ""Two of my most favourite people. Such a lovely morning."" On the work front, Kangana is busy in promotions of 'Tiku Weds Sheru', which marks her maiden production venture.",actress kangana ranaut reacted elon musk ' latest statement ' fan prime minister narendra modi sharing picture elon musk pm modi instagram kangana wrote two favourite people lovely morning work front kangana busy promotions ' tiku weds sheru ' marks maiden production venture,actress kangana ranaut react elon musk ' latest statement ' fan prime minist narendra modi share pictur elon musk pm modi instagram kangana wrote two favourit peopl love morn work front kangana busi promot ' tiku wed sheru ' mark maiden product ventur,actress kangana ranaut reacted elon musk ' latest statement ' fan prime minister narendra modi sharing picture elon musk pm modi instagram kangana wrote two favourite people lovely morning work front kangana busy promotion ' tiku wed sheru ' mark maiden production venture
97,entertainment,"Thought kids at school would laugh at me, said no: Jugal on Masoom","Jugal Hansraj said that he had initially rejected Shekhar Kapur's 'Masoom'. ""When I heard the story, I thought this kid cries a lot...Everyone in my school would laugh at me...call me a crybaby...I said no,"" Hansraj shared. He added that the filmmaker insisted that he wanted Jugal in the film. ""I actually had...lot of fun,"" Hansraj said about shooting 'Masoom'.",jugal hansraj said initially rejected shekhar kapur ' ' masoom ' heard story thought kid cries loteveryone school would laugh mecall crybabyi said hansraj shared added filmmaker insisted wanted jugal film actually hadlot fun hansraj said shooting ' masoom ',jugal hansraj said initi reject shekhar kapur ' ' masoom ' heard stori thought kid cri loteveryon school would laugh mecal crybabyi said hansraj share ad filmmak insist want jugal film actual hadlot fun hansraj said shoot ' masoom ',jugal hansraj said initially rejected shekhar kapur ' ' masoom ' heard story thought kid cry loteveryone school would laugh mecall crybabyi said hansraj shared added filmmaker insisted wanted jugal film actually hadlot fun hansraj said shooting ' masoom '
98,entertainment,"Richard Gere attends Yoga event led by PM Modi at UN HQ, video out","Actor Richard Gere was in attendance at the Yoga event led by PM Narendra Modi on the occasion of International Yoga Day on June 21 at the United Nations Headquarters in New York. ""It is a very nice feeling here today, so open and embracing,"" Gere told reporters. Gere was among people of 135 nationalities who attended the event.",actor richard gere attendance yoga event led pm narendra modi occasion international yoga day june 21 united nations headquarters new york nice feeling today open embracing gere told reporters gere among people 135 nationalities attended event,actor richard gere attend yoga event led pm narendra modi occas intern yoga day june 21 unit nation headquart new york nice feel today open embrac gere told report gere among peopl 135 nation attend event,actor richard gere attendance yoga event led pm narendra modi occasion international yoga day june 21 united nation headquarters new york nice feeling today open embracing gere told reporter gere among people 135 nationality attended event


## codeup_df DATAFRAME

In [29]:
# df.rename columns from content to original
codeup_df = codeup_df.rename(columns={'content':'original'})
codeup_df

Unnamed: 0,title,original
0,Spotlight on APIDA Voices: Celebrating Heritage and Inspiring Change ft. Arbeena Thapa,"May is traditionally known as Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contributions made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunities to better understand the AAPI community. In an effort to address real concerns and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies as Nepali American and Desi. Arbeena’s parents immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister was five when they made the move to the US. Arbeena was born later, becoming the first in her family to be a US citizen. At Codeup we take our efforts at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversation with Arbeena went! How do you celebrate or connect with your heritage and cultural traditions? “I celebrate Nepal’s version of Christmas or Dashain. This is a nine-day celebration also known as Dussehra. I grew up as Hindu and I identify as Hindu, this is a very large part of my heritage. “ “Other ways I connect with my culture include sharing food! Momos are South Asian Dumplings and they’re my favorite to make and share.” “On my Asian American side, I am an advocate of immigrant justice and erasure within APIDA social or political movements. I participate in events to embrace my identity such as immigrant justice advocacy because I come from a mixed-status family. I’ve always been in a community with undocumented Asian immigrants. .” What are some of the challenges you have faced as an APIDA individual, personally or professionally? “I often struggle with being gendered as compliant or a pushover. Professionally, I am often stereotyped as meek, so I’ve been overlooked for leadership roles. We are seen as perpetually foreign; people tend to other us in that way, yet put us on a pedestal for what a model minority looks like. This has made me hesitant to share my heritage in the past because these assumptions get mapped onto me. ” Can you describe some common barriers of entry that APIDA individuals, specifically women may face when trying to enter or advance in the workplace? “Being overlooked for leadership. In the past, I have not been viewed as a leader. People sometimes have preconceived stereotypes of Asian women not being able to be bold, or being vocal can be mistaken for being too emotional. “ How do you believe microaggressions impact APIDA individuals in the workplace? Can you provide examples of such microaggressions? “Erasure is big. To me, only saying ‘Merry Christmas’ isn’t inclusive to other religions. People are often resistant to saying ‘Happy Holidays,’ but saying Merry Christmas excludes, and does not appreciate my heritage. “ “Often microaggressions are not micro at all. They typically are not aggressive racialized violence, but the term ‘micro’ minimizes impact.” “Some that I’ve heard are ‘What kind of Asian are you?’ or ‘Where are you from?’ This automatically makes me the ‘other’ and not seen as American. Even within the APIDA community, South Asians are overlooked as “Asian”.” How important is representation, specifically APIDA representation, in organizational leadership positions? “I want to say that it is important to have someone who looks like you in leadership roles, and it is, but those leaders may not share the same beliefs as you. Certain privileges such as wealth, resources, or lack of interaction with lower-socioeconomic-status Asian Americans may cause a difference in community politics. I do not think the bamboo ceiling is acceptable, but the company you work for plays a big part in your politics and belief alignment.” How do you feel about code-switching, and have you ever felt it necessary to code-switch? “I like sharing South Asian terms or connecting with others that have similar heritage and culture. A workplace that is welcoming to going into this sort of breakout is refreshing and makes space for us. However, having to code-switch could also mean a workplace that is not conducive and welcoming of other cultures. “ Finally, in your opinion, what long-term strategies can create lasting change in the workplace and ensure support, equality, and inclusion for APIDA individuals? “Prior to a career in financial aid, I did a lot of research related to the post-9/11 immigration of the South Asian diaspora. This background made me heavily rely on grassroots organizing. Hire the people that want to innovate, hire the changemakers, hire the button-pushers. Reduce reliance on whiteness as change. This will become natural for the organization and become organizational change. Change comes from us on the ground.” A huge thank you to Arbeena Thapa for sharing her experiences, and being vulnerable with us. Your words were inspiring and the opportunity to understand your perspective more has been valuable. We hope we can become better support for the APIDA community as we learn and grow on our journey of cultivating inclusive growth."
1,Women in tech: Panelist Spotlight – Magdalena Rahn,"Women in tech: Panelist Spotlight – Magdalena Rahn Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Magdalena! Magdalena Rahn is a current Codeup student in a Data Science cohort in San Antonio, Texas. She has a professional background in cross-cultural communications, international business development, the wine industry and journalism. After serving in the US Navy, she decided to complement her professional skill set by attending the Data Science program at Codeup; she is set to graduate in March 2023. Magdalena is fluent in French, Bulgarian, Chinese-Mandarin, Spanish and Italian. We asked Magdalena how Codeup impacted her career, and she replied “Codeup has provided a solid foundation in analytical processes, programming and data science methods, and it’s been an encouragement to have such supportive instructors and wonderful classmates.” Don’t forget to tune in on March 29th to sit in on an insightful conversation with Magdalena."
2,Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill,"Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Rachel! Rachel Robbins-Mayhill is a Decision Science Analyst I in San Antonio, Texas. Rachel has had a varied career that includes counseling, teaching, training, community development, and military operations. Her focus has always been on assessing needs, identifying solutions, and educating individuals and groups on aligning needs and solutions in different contexts. Rachel’s passion for data science stems from her belief that data is a powerful tool for communicating patterns that can lead to hope and growth in the future. In June 2022, Rachel graduated from Codeup’s Innis cohort, where she honed her skills in data science. Shortly after, she started working as a Data Science Technical Writer with Apex Systems as a Contractor for USAA in July 2022. Her unconventional role allowed her to understand where her skills could be best utilized to support USAA in a non-contract role. Rachel recently joined USAA’s Data Science Delivery team as a Decision Science Analyst I in February 2023. The team is focused on delivering machine learning models for fraud prevention, and Rachel’s particular role centers around providing strategic process solutions for the team in collaboration with Operational and Model Risk components. In addition to her career, Rachel is currently pursuing a master’s degree in Applied Data Science from Syracuse University, further expanding her knowledge and skills in the field. Rachel is passionate about collaborating with individuals who share her belief in the potential of others and strive to achieve growth through logical, informed action. She welcomes LinkedIn connections and is excited about supporting the network of CodeUp alumni! We asked Rachel how Codeup impacted her career, and she replied “Codeup delivered a comprehensive education in all facets of the data science pipeline, laying a strong foundation for me to build upon. Through repeated hands-on practice, I developed a reliable process that was immediately applicable in my job. Collaborative group projects were instrumental in helping me hone my skills in project management, allowing me to navigate complex data science projects with comfortability. Thanks to this invaluable experience, I was able to make significant strides in my career within just six months of graduating from Codeup.” Don’t forget to tune in on March 29th to sit in on an insightful conversation."
3,Women in Tech: Panelist Spotlight – Sarah Mellor,"Women in tech: Panelist Spotlight – Sarah Mellor Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Sarah! Sarah Mellor currently works as the Director of People Operations. She joined Codeup four and a half years ago as an Admissions Manager. She went on to build out and lead the Marketing and Admissions team, while picking up People Ops tasks and projects here and there until moving over to lead the People Ops team two years ago. Prior to Codeup, she worked at education-focused non-profits in Washington, DC and Boulder, Colorado. She graduated from Wake Forest University. We asked Sarah how Codeup has impacted her career, and her response was “I have absolutely loved having the privilege to grow alongside Codeup. In my time here across multiple different roles and departments, I’ve seen a lot of change. The consistent things have always been the high quality of passionate and hardworking people I get to work with; the impactful mission we get to work on; and the inspiring students who trust us with their career change.” Don’t forget to tune in on March 29th to sit in on an insightful conversation."
4,Women in Tech: Panelist Spotlight – Madeleine Capper,"Women in tech: Panelist Spotlight – Madeleine Capper Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Madeleine! Madeleine Capper is a Data Scientist in San Antonio, Texas. A long-standing San Antonio resident, she studied mathematics at the University of Texas San Antonio and has worked as a Data Scientist for Booz Allen Hamilton. Madeleine currently teaches Data Science at Codeup, where she works daily with burgeoning data professionals to help them actualize their career aspirations through technical education. Madeleine attended Codeup as a student in early 2019 as a pupil in the very first Codeup Data Science cohort. The program proved immediately effective and she was the first student to obtain a data career out of the program. After working at Booz Allen Hamilton, Madeleine’s passion for education in conjunction with her appreciation for Codeup’s capacity for transformative life change brought her back to the institution in an instructional capacity, where she has been teaching for two years. Don’t forget to tune in on March 29th to sit in on an insightful conversation."
5,Black Excellence in Tech: Panelist Spotlight – Wilmarie De La Cruz Mejia,"Black excellence in tech: Panelist Spotlight – Wilmarie De La Cruz Mejia Codeup is hosting a Black Excellence in Tech Panel in honor of Black History Month on February 22, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as black leaders in the tech industry! Meet Wilmarie! Wilmarie De La Cruz Mejia is a current Codeup student on the path to becoming a Full-Stack Web Developer at our Dallas, TX campus. Wilmarie is a veteran expanding her knowledge of programming languages and technologies on her journey with Codeup. We asked Wilmarie to share more about her experience at Codeup. She shares, “I was able to meet other people who were passionate about coding and be in a positive learning environment.” We hope you can join us on February 22nd to sit in on an insightful conversation with Wilmarie and all of our panelists!"


In [30]:
# combine basic and tokenize functions
basic_clean(tokenize(remove_stopwords(codeup_df.original[0], extra_words, exclude_words)))

'may traditionally known asian american pacific islander  aapi  heritage month this month celebrate history contributions made possible aapi friends  family  community we also examine level support seek opportunities better understand aapi community in effort address real concerns experiences  sat arbeena thapa  one codeup  s financial aid enrollment managers arbeena identifies nepali american desi arbeena  s parents immigrated texas 1988 better employment educational opportunities arbeena  s older sister five made move us arbeena born later  becoming first family us citizen at codeup take efforts inclusivity seriously after speaking arbeena  taught term aapi excludes desiamerican individuals hence  use term asian pacific islander desi american  apida   here rest conversation arbeena went  how celebrate connect heritage cultural traditions  i celebrate nepal  s version christmas dashain this nineday celebration also known dussehra i grew hindu i identify hindu  large part heritage  oth

In [31]:
codeup_df['clean'] = codeup_df.original.apply(basic_clean).apply(tokenize).apply(remove_stopwords, extra_words=extra_words, exclude_words=exclude_words)
codeup_df.head()

Unnamed: 0,title,original,clean
0,Spotlight on APIDA Voices: Celebrating Heritage and Inspiring Change ft. Arbeena Thapa,"May is traditionally known as Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contributions made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunities to better understand the AAPI community. In an effort to address real concerns and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies as Nepali American and Desi. Arbeena’s parents immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister was five when they made the move to the US. Arbeena was born later, becoming the first in her family to be a US citizen. At Codeup we take our efforts at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversation with Arbeena went! How do you celebrate or connect with your heritage and cultural traditions? “I celebrate Nepal’s version of Christmas or Dashain. This is a nine-day celebration also known as Dussehra. I grew up as Hindu and I identify as Hindu, this is a very large part of my heritage. “ “Other ways I connect with my culture include sharing food! Momos are South Asian Dumplings and they’re my favorite to make and share.” “On my Asian American side, I am an advocate of immigrant justice and erasure within APIDA social or political movements. I participate in events to embrace my identity such as immigrant justice advocacy because I come from a mixed-status family. I’ve always been in a community with undocumented Asian immigrants. .” What are some of the challenges you have faced as an APIDA individual, personally or professionally? “I often struggle with being gendered as compliant or a pushover. Professionally, I am often stereotyped as meek, so I’ve been overlooked for leadership roles. We are seen as perpetually foreign; people tend to other us in that way, yet put us on a pedestal for what a model minority looks like. This has made me hesitant to share my heritage in the past because these assumptions get mapped onto me. ” Can you describe some common barriers of entry that APIDA individuals, specifically women may face when trying to enter or advance in the workplace? “Being overlooked for leadership. In the past, I have not been viewed as a leader. People sometimes have preconceived stereotypes of Asian women not being able to be bold, or being vocal can be mistaken for being too emotional. “ How do you believe microaggressions impact APIDA individuals in the workplace? Can you provide examples of such microaggressions? “Erasure is big. To me, only saying ‘Merry Christmas’ isn’t inclusive to other religions. People are often resistant to saying ‘Happy Holidays,’ but saying Merry Christmas excludes, and does not appreciate my heritage. “ “Often microaggressions are not micro at all. They typically are not aggressive racialized violence, but the term ‘micro’ minimizes impact.” “Some that I’ve heard are ‘What kind of Asian are you?’ or ‘Where are you from?’ This automatically makes me the ‘other’ and not seen as American. Even within the APIDA community, South Asians are overlooked as “Asian”.” How important is representation, specifically APIDA representation, in organizational leadership positions? “I want to say that it is important to have someone who looks like you in leadership roles, and it is, but those leaders may not share the same beliefs as you. Certain privileges such as wealth, resources, or lack of interaction with lower-socioeconomic-status Asian Americans may cause a difference in community politics. I do not think the bamboo ceiling is acceptable, but the company you work for plays a big part in your politics and belief alignment.” How do you feel about code-switching, and have you ever felt it necessary to code-switch? “I like sharing South Asian terms or connecting with others that have similar heritage and culture. A workplace that is welcoming to going into this sort of breakout is refreshing and makes space for us. However, having to code-switch could also mean a workplace that is not conducive and welcoming of other cultures. “ Finally, in your opinion, what long-term strategies can create lasting change in the workplace and ensure support, equality, and inclusion for APIDA individuals? “Prior to a career in financial aid, I did a lot of research related to the post-9/11 immigration of the South Asian diaspora. This background made me heavily rely on grassroots organizing. Hire the people that want to innovate, hire the changemakers, hire the button-pushers. Reduce reliance on whiteness as change. This will become natural for the organization and become organizational change. Change comes from us on the ground.” A huge thank you to Arbeena Thapa for sharing her experiences, and being vulnerable with us. Your words were inspiring and the opportunity to understand your perspective more has been valuable. We hope we can become better support for the APIDA community as we learn and grow on our journey of cultivating inclusive growth.",may traditionally known asian american pacific islander aapi heritage month month celebrate history contributions made possible aapi friends family community also examine level support seek opportunities better understand aapi community effort address real concerns experiences sat arbeena thapa one codeups financial aid enrollment managers arbeena identifies nepali american desi arbeenas parents immigrated texas 1988 better employment educational opportunities arbeenas older sister five made move us arbeena born later becoming first family us citizen codeup take efforts inclusivity seriously speaking arbeena taught term aapi excludes desiamerican individuals hence use term asian pacific islander desi american apida rest conversation arbeena went celebrate connect heritage cultural traditions celebrate nepals version christmas dashain nineday celebration also known dussehra grew hindu identify hindu large part heritage ways connect culture include sharing food momos south asian dumplings theyre favorite make share asian american side advocate immigrant justice erasure within apida social political movements participate events embrace identity immigrant justice advocacy come mixedstatus family ive always community undocumented asian immigrants challenges faced apida individual personally professionally often struggle gendered compliant pushover professionally often stereotyped meek ive overlooked leadership roles seen perpetually foreign people tend us way yet put us pedestal model minority looks like made hesitant share heritage past assumptions get mapped onto describe common barriers entry apida individuals specifically women may face trying enter advance workplace overlooked leadership past viewed leader people sometimes preconceived stereotypes asian women able bold vocal mistaken emotional believe microaggressions impact apida individuals workplace provide examples microaggressions erasure big saying merry christmas isnt inclusive religions people often resistant saying happy holidays saying merry christmas excludes appreciate heritage often microaggressions micro typically aggressive racialized violence term micro minimizes impact ive heard kind asian automatically makes seen american even within apida community south asians overlooked asian important representation specifically apida representation organizational leadership positions want say important someone looks like leadership roles leaders may share beliefs certain privileges wealth resources lack interaction lowersocioeconomicstatus asian americans may cause difference community politics think bamboo ceiling acceptable company work plays big part politics belief alignment feel codeswitching ever felt necessary codeswitch like sharing south asian terms connecting others similar heritage culture workplace welcoming going sort breakout refreshing makes space us however codeswitch could also mean workplace conducive welcoming cultures finally opinion longterm strategies create lasting change workplace ensure support equality inclusion apida individuals prior career financial aid lot research related post911 immigration south asian diaspora background made heavily rely grassroots organizing hire people want innovate hire changemakers hire buttonpushers reduce reliance whiteness change become natural organization become organizational change change comes us ground huge thank arbeena thapa sharing experiences vulnerable us words inspiring opportunity understand perspective valuable hope become better support apida community learn grow journey cultivating inclusive growth
1,Women in tech: Panelist Spotlight – Magdalena Rahn,"Women in tech: Panelist Spotlight – Magdalena Rahn Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Magdalena! Magdalena Rahn is a current Codeup student in a Data Science cohort in San Antonio, Texas. She has a professional background in cross-cultural communications, international business development, the wine industry and journalism. After serving in the US Navy, she decided to complement her professional skill set by attending the Data Science program at Codeup; she is set to graduate in March 2023. Magdalena is fluent in French, Bulgarian, Chinese-Mandarin, Spanish and Italian. We asked Magdalena how Codeup impacted her career, and she replied “Codeup has provided a solid foundation in analytical processes, programming and data science methods, and it’s been an encouragement to have such supportive instructors and wonderful classmates.” Don’t forget to tune in on March 29th to sit in on an insightful conversation with Magdalena.",women tech panelist spotlight magdalena rahn codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet magdalena magdalena rahn current codeup student data science cohort san antonio texas professional background crosscultural communications international business development wine industry journalism serving us navy decided complement professional skill set attending data science program codeup set graduate march 2023 magdalena fluent french bulgarian chinesemandarin spanish italian asked magdalena codeup impacted career replied codeup provided solid foundation analytical processes programming data science methods encouragement supportive instructors wonderful classmates dont forget tune march 29th sit insightful conversation magdalena
2,Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill,"Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Rachel! Rachel Robbins-Mayhill is a Decision Science Analyst I in San Antonio, Texas. Rachel has had a varied career that includes counseling, teaching, training, community development, and military operations. Her focus has always been on assessing needs, identifying solutions, and educating individuals and groups on aligning needs and solutions in different contexts. Rachel’s passion for data science stems from her belief that data is a powerful tool for communicating patterns that can lead to hope and growth in the future. In June 2022, Rachel graduated from Codeup’s Innis cohort, where she honed her skills in data science. Shortly after, she started working as a Data Science Technical Writer with Apex Systems as a Contractor for USAA in July 2022. Her unconventional role allowed her to understand where her skills could be best utilized to support USAA in a non-contract role. Rachel recently joined USAA’s Data Science Delivery team as a Decision Science Analyst I in February 2023. The team is focused on delivering machine learning models for fraud prevention, and Rachel’s particular role centers around providing strategic process solutions for the team in collaboration with Operational and Model Risk components. In addition to her career, Rachel is currently pursuing a master’s degree in Applied Data Science from Syracuse University, further expanding her knowledge and skills in the field. Rachel is passionate about collaborating with individuals who share her belief in the potential of others and strive to achieve growth through logical, informed action. She welcomes LinkedIn connections and is excited about supporting the network of CodeUp alumni! We asked Rachel how Codeup impacted her career, and she replied “Codeup delivered a comprehensive education in all facets of the data science pipeline, laying a strong foundation for me to build upon. Through repeated hands-on practice, I developed a reliable process that was immediately applicable in my job. Collaborative group projects were instrumental in helping me hone my skills in project management, allowing me to navigate complex data science projects with comfortability. Thanks to this invaluable experience, I was able to make significant strides in my career within just six months of graduating from Codeup.” Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight rachel robbinsmayhill codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet rachel rachel robbinsmayhill decision science analyst san antonio texas rachel varied career includes counseling teaching training community development military operations focus always assessing needs identifying solutions educating individuals groups aligning needs solutions different contexts rachels passion data science stems belief data powerful tool communicating patterns lead hope growth future june 2022 rachel graduated codeups innis cohort honed skills data science shortly started working data science technical writer apex systems contractor usaa july 2022 unconventional role allowed understand skills could best utilized support usaa noncontract role rachel recently joined usaas data science delivery team decision science analyst february 2023 team focused delivering machine learning models fraud prevention rachels particular role centers around providing strategic process solutions team collaboration operational model risk components addition career rachel currently pursuing masters degree applied data science syracuse university expanding knowledge skills field rachel passionate collaborating individuals share belief potential others strive achieve growth logical informed action welcomes linkedin connections excited supporting network codeup alumni asked rachel codeup impacted career replied codeup delivered comprehensive education facets data science pipeline laying strong foundation build upon repeated handson practice developed reliable process immediately applicable job collaborative group projects instrumental helping hone skills project management allowing navigate complex data science projects comfortability thanks invaluable experience able make significant strides career within six months graduating codeup dont forget tune march 29th sit insightful conversation
3,Women in Tech: Panelist Spotlight – Sarah Mellor,"Women in tech: Panelist Spotlight – Sarah Mellor Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Sarah! Sarah Mellor currently works as the Director of People Operations. She joined Codeup four and a half years ago as an Admissions Manager. She went on to build out and lead the Marketing and Admissions team, while picking up People Ops tasks and projects here and there until moving over to lead the People Ops team two years ago. Prior to Codeup, she worked at education-focused non-profits in Washington, DC and Boulder, Colorado. She graduated from Wake Forest University. We asked Sarah how Codeup has impacted her career, and her response was “I have absolutely loved having the privilege to grow alongside Codeup. In my time here across multiple different roles and departments, I’ve seen a lot of change. The consistent things have always been the high quality of passionate and hardworking people I get to work with; the impactful mission we get to work on; and the inspiring students who trust us with their career change.” Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight sarah mellor codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet sarah sarah mellor currently works director people operations joined codeup four half years ago admissions manager went build lead marketing admissions team picking people ops tasks projects moving lead people ops team two years ago prior codeup worked educationfocused nonprofits washington dc boulder colorado graduated wake forest university asked sarah codeup impacted career response absolutely loved privilege grow alongside codeup time across multiple different roles departments ive seen lot change consistent things always high quality passionate hardworking people get work impactful mission get work inspiring students trust us career change dont forget tune march 29th sit insightful conversation
4,Women in Tech: Panelist Spotlight – Madeleine Capper,"Women in tech: Panelist Spotlight – Madeleine Capper Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Madeleine! Madeleine Capper is a Data Scientist in San Antonio, Texas. A long-standing San Antonio resident, she studied mathematics at the University of Texas San Antonio and has worked as a Data Scientist for Booz Allen Hamilton. Madeleine currently teaches Data Science at Codeup, where she works daily with burgeoning data professionals to help them actualize their career aspirations through technical education. Madeleine attended Codeup as a student in early 2019 as a pupil in the very first Codeup Data Science cohort. The program proved immediately effective and she was the first student to obtain a data career out of the program. After working at Booz Allen Hamilton, Madeleine’s passion for education in conjunction with her appreciation for Codeup’s capacity for transformative life change brought her back to the institution in an instructional capacity, where she has been teaching for two years. Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight madeleine capper codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet madeleine madeleine capper data scientist san antonio texas longstanding san antonio resident studied mathematics university texas san antonio worked data scientist booz allen hamilton madeleine currently teaches data science codeup works daily burgeoning data professionals help actualize career aspirations technical education madeleine attended codeup student early 2019 pupil first codeup data science cohort program proved immediately effective first student obtain data career program working booz allen hamilton madeleines passion education conjunction appreciation codeups capacity transformative life change brought back institution instructional capacity teaching two years dont forget tune march 29th sit insightful conversation


In [32]:
stem(codeup_df.clean[0])

'may tradit known asian american pacif island aapi heritag month month celebr histori contribut made possibl aapi friend famili commun also examin level support seek opportun better understand aapi commun effort address real concern experi sat arbeena thapa one codeup financi aid enrol manag arbeena identifi nepali american desi arbeena parent immigr texa 1988 better employ educ opportun arbeena older sister five made move us arbeena born later becom first famili us citizen codeup take effort inclus serious speak arbeena taught term aapi exclud desiamerican individu henc use term asian pacif island desi american apida rest convers arbeena went celebr connect heritag cultur tradit celebr nepal version christma dashain nineday celebr also known dussehra grew hindu identifi hindu larg part heritag way connect cultur includ share food momo south asian dumpl theyr favorit make share asian american side advoc immigr justic erasur within apida social polit movement particip event embrac ident

In [33]:
codeup_df['stemmed'] = codeup_df.clean.apply(stem)
codeup_df

Unnamed: 0,title,original,clean,stemmed
0,Spotlight on APIDA Voices: Celebrating Heritage and Inspiring Change ft. Arbeena Thapa,"May is traditionally known as Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contributions made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunities to better understand the AAPI community. In an effort to address real concerns and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies as Nepali American and Desi. Arbeena’s parents immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister was five when they made the move to the US. Arbeena was born later, becoming the first in her family to be a US citizen. At Codeup we take our efforts at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversation with Arbeena went! How do you celebrate or connect with your heritage and cultural traditions? “I celebrate Nepal’s version of Christmas or Dashain. This is a nine-day celebration also known as Dussehra. I grew up as Hindu and I identify as Hindu, this is a very large part of my heritage. “ “Other ways I connect with my culture include sharing food! Momos are South Asian Dumplings and they’re my favorite to make and share.” “On my Asian American side, I am an advocate of immigrant justice and erasure within APIDA social or political movements. I participate in events to embrace my identity such as immigrant justice advocacy because I come from a mixed-status family. I’ve always been in a community with undocumented Asian immigrants. .” What are some of the challenges you have faced as an APIDA individual, personally or professionally? “I often struggle with being gendered as compliant or a pushover. Professionally, I am often stereotyped as meek, so I’ve been overlooked for leadership roles. We are seen as perpetually foreign; people tend to other us in that way, yet put us on a pedestal for what a model minority looks like. This has made me hesitant to share my heritage in the past because these assumptions get mapped onto me. ” Can you describe some common barriers of entry that APIDA individuals, specifically women may face when trying to enter or advance in the workplace? “Being overlooked for leadership. In the past, I have not been viewed as a leader. People sometimes have preconceived stereotypes of Asian women not being able to be bold, or being vocal can be mistaken for being too emotional. “ How do you believe microaggressions impact APIDA individuals in the workplace? Can you provide examples of such microaggressions? “Erasure is big. To me, only saying ‘Merry Christmas’ isn’t inclusive to other religions. People are often resistant to saying ‘Happy Holidays,’ but saying Merry Christmas excludes, and does not appreciate my heritage. “ “Often microaggressions are not micro at all. They typically are not aggressive racialized violence, but the term ‘micro’ minimizes impact.” “Some that I’ve heard are ‘What kind of Asian are you?’ or ‘Where are you from?’ This automatically makes me the ‘other’ and not seen as American. Even within the APIDA community, South Asians are overlooked as “Asian”.” How important is representation, specifically APIDA representation, in organizational leadership positions? “I want to say that it is important to have someone who looks like you in leadership roles, and it is, but those leaders may not share the same beliefs as you. Certain privileges such as wealth, resources, or lack of interaction with lower-socioeconomic-status Asian Americans may cause a difference in community politics. I do not think the bamboo ceiling is acceptable, but the company you work for plays a big part in your politics and belief alignment.” How do you feel about code-switching, and have you ever felt it necessary to code-switch? “I like sharing South Asian terms or connecting with others that have similar heritage and culture. A workplace that is welcoming to going into this sort of breakout is refreshing and makes space for us. However, having to code-switch could also mean a workplace that is not conducive and welcoming of other cultures. “ Finally, in your opinion, what long-term strategies can create lasting change in the workplace and ensure support, equality, and inclusion for APIDA individuals? “Prior to a career in financial aid, I did a lot of research related to the post-9/11 immigration of the South Asian diaspora. This background made me heavily rely on grassroots organizing. Hire the people that want to innovate, hire the changemakers, hire the button-pushers. Reduce reliance on whiteness as change. This will become natural for the organization and become organizational change. Change comes from us on the ground.” A huge thank you to Arbeena Thapa for sharing her experiences, and being vulnerable with us. Your words were inspiring and the opportunity to understand your perspective more has been valuable. We hope we can become better support for the APIDA community as we learn and grow on our journey of cultivating inclusive growth.",may traditionally known asian american pacific islander aapi heritage month month celebrate history contributions made possible aapi friends family community also examine level support seek opportunities better understand aapi community effort address real concerns experiences sat arbeena thapa one codeups financial aid enrollment managers arbeena identifies nepali american desi arbeenas parents immigrated texas 1988 better employment educational opportunities arbeenas older sister five made move us arbeena born later becoming first family us citizen codeup take efforts inclusivity seriously speaking arbeena taught term aapi excludes desiamerican individuals hence use term asian pacific islander desi american apida rest conversation arbeena went celebrate connect heritage cultural traditions celebrate nepals version christmas dashain nineday celebration also known dussehra grew hindu identify hindu large part heritage ways connect culture include sharing food momos south asian dumplings theyre favorite make share asian american side advocate immigrant justice erasure within apida social political movements participate events embrace identity immigrant justice advocacy come mixedstatus family ive always community undocumented asian immigrants challenges faced apida individual personally professionally often struggle gendered compliant pushover professionally often stereotyped meek ive overlooked leadership roles seen perpetually foreign people tend us way yet put us pedestal model minority looks like made hesitant share heritage past assumptions get mapped onto describe common barriers entry apida individuals specifically women may face trying enter advance workplace overlooked leadership past viewed leader people sometimes preconceived stereotypes asian women able bold vocal mistaken emotional believe microaggressions impact apida individuals workplace provide examples microaggressions erasure big saying merry christmas isnt inclusive religions people often resistant saying happy holidays saying merry christmas excludes appreciate heritage often microaggressions micro typically aggressive racialized violence term micro minimizes impact ive heard kind asian automatically makes seen american even within apida community south asians overlooked asian important representation specifically apida representation organizational leadership positions want say important someone looks like leadership roles leaders may share beliefs certain privileges wealth resources lack interaction lowersocioeconomicstatus asian americans may cause difference community politics think bamboo ceiling acceptable company work plays big part politics belief alignment feel codeswitching ever felt necessary codeswitch like sharing south asian terms connecting others similar heritage culture workplace welcoming going sort breakout refreshing makes space us however codeswitch could also mean workplace conducive welcoming cultures finally opinion longterm strategies create lasting change workplace ensure support equality inclusion apida individuals prior career financial aid lot research related post911 immigration south asian diaspora background made heavily rely grassroots organizing hire people want innovate hire changemakers hire buttonpushers reduce reliance whiteness change become natural organization become organizational change change comes us ground huge thank arbeena thapa sharing experiences vulnerable us words inspiring opportunity understand perspective valuable hope become better support apida community learn grow journey cultivating inclusive growth,may tradit known asian american pacif island aapi heritag month month celebr histori contribut made possibl aapi friend famili commun also examin level support seek opportun better understand aapi commun effort address real concern experi sat arbeena thapa one codeup financi aid enrol manag arbeena identifi nepali american desi arbeena parent immigr texa 1988 better employ educ opportun arbeena older sister five made move us arbeena born later becom first famili us citizen codeup take effort inclus serious speak arbeena taught term aapi exclud desiamerican individu henc use term asian pacif island desi american apida rest convers arbeena went celebr connect heritag cultur tradit celebr nepal version christma dashain nineday celebr also known dussehra grew hindu identifi hindu larg part heritag way connect cultur includ share food momo south asian dumpl theyr favorit make share asian american side advoc immigr justic erasur within apida social polit movement particip event embrac ident immigr justic advocaci come mixedstatu famili ive alway commun undocu asian immigr challeng face apida individu person profession often struggl gender compliant pushov profession often stereotyp meek ive overlook leadership role seen perpetu foreign peopl tend us way yet put us pedest model minor look like made hesit share heritag past assumpt get map onto describ common barrier entri apida individu specif women may face tri enter advanc workplac overlook leadership past view leader peopl sometim preconceiv stereotyp asian women abl bold vocal mistaken emot believ microaggress impact apida individu workplac provid exampl microaggress erasur big say merri christma isnt inclus religion peopl often resist say happi holiday say merri christma exclud appreci heritag often microaggress micro typic aggress racial violenc term micro minim impact ive heard kind asian automat make seen american even within apida commun south asian overlook asian import represent specif apida represent organiz leadership posit want say import someon look like leadership role leader may share belief certain privileg wealth resourc lack interact lowersocioeconomicstatu asian american may caus differ commun polit think bamboo ceil accept compani work play big part polit belief align feel codeswitch ever felt necessari codeswitch like share south asian term connect other similar heritag cultur workplac welcom go sort breakout refresh make space us howev codeswitch could also mean workplac conduc welcom cultur final opinion longterm strategi creat last chang workplac ensur support equal inclus apida individu prior career financi aid lot research relat post911 immigr south asian diaspora background made heavili reli grassroot organ hire peopl want innov hire changemak hire buttonpush reduc relianc white chang becom natur organ becom organiz chang chang come us ground huge thank arbeena thapa share experi vulner us word inspir opportun understand perspect valuabl hope becom better support apida commun learn grow journey cultiv inclus growth
1,Women in tech: Panelist Spotlight – Magdalena Rahn,"Women in tech: Panelist Spotlight – Magdalena Rahn Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Magdalena! Magdalena Rahn is a current Codeup student in a Data Science cohort in San Antonio, Texas. She has a professional background in cross-cultural communications, international business development, the wine industry and journalism. After serving in the US Navy, she decided to complement her professional skill set by attending the Data Science program at Codeup; she is set to graduate in March 2023. Magdalena is fluent in French, Bulgarian, Chinese-Mandarin, Spanish and Italian. We asked Magdalena how Codeup impacted her career, and she replied “Codeup has provided a solid foundation in analytical processes, programming and data science methods, and it’s been an encouragement to have such supportive instructors and wonderful classmates.” Don’t forget to tune in on March 29th to sit in on an insightful conversation with Magdalena.",women tech panelist spotlight magdalena rahn codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet magdalena magdalena rahn current codeup student data science cohort san antonio texas professional background crosscultural communications international business development wine industry journalism serving us navy decided complement professional skill set attending data science program codeup set graduate march 2023 magdalena fluent french bulgarian chinesemandarin spanish italian asked magdalena codeup impacted career replied codeup provided solid foundation analytical processes programming data science methods encouragement supportive instructors wonderful classmates dont forget tune march 29th sit insightful conversation magdalena,women tech panelist spotlight magdalena rahn codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet magdalena magdalena rahn current codeup student data scienc cohort san antonio texa profession background crosscultur commun intern busi develop wine industri journal serv us navi decid complement profession skill set attend data scienc program codeup set graduat march 2023 magdalena fluent french bulgarian chinesemandarin spanish italian ask magdalena codeup impact career repli codeup provid solid foundat analyt process program data scienc method encourag support instructor wonder classmat dont forget tune march 29th sit insight convers magdalena
2,Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill,"Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Rachel! Rachel Robbins-Mayhill is a Decision Science Analyst I in San Antonio, Texas. Rachel has had a varied career that includes counseling, teaching, training, community development, and military operations. Her focus has always been on assessing needs, identifying solutions, and educating individuals and groups on aligning needs and solutions in different contexts. Rachel’s passion for data science stems from her belief that data is a powerful tool for communicating patterns that can lead to hope and growth in the future. In June 2022, Rachel graduated from Codeup’s Innis cohort, where she honed her skills in data science. Shortly after, she started working as a Data Science Technical Writer with Apex Systems as a Contractor for USAA in July 2022. Her unconventional role allowed her to understand where her skills could be best utilized to support USAA in a non-contract role. Rachel recently joined USAA’s Data Science Delivery team as a Decision Science Analyst I in February 2023. The team is focused on delivering machine learning models for fraud prevention, and Rachel’s particular role centers around providing strategic process solutions for the team in collaboration with Operational and Model Risk components. In addition to her career, Rachel is currently pursuing a master’s degree in Applied Data Science from Syracuse University, further expanding her knowledge and skills in the field. Rachel is passionate about collaborating with individuals who share her belief in the potential of others and strive to achieve growth through logical, informed action. She welcomes LinkedIn connections and is excited about supporting the network of CodeUp alumni! We asked Rachel how Codeup impacted her career, and she replied “Codeup delivered a comprehensive education in all facets of the data science pipeline, laying a strong foundation for me to build upon. Through repeated hands-on practice, I developed a reliable process that was immediately applicable in my job. Collaborative group projects were instrumental in helping me hone my skills in project management, allowing me to navigate complex data science projects with comfortability. Thanks to this invaluable experience, I was able to make significant strides in my career within just six months of graduating from Codeup.” Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight rachel robbinsmayhill codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet rachel rachel robbinsmayhill decision science analyst san antonio texas rachel varied career includes counseling teaching training community development military operations focus always assessing needs identifying solutions educating individuals groups aligning needs solutions different contexts rachels passion data science stems belief data powerful tool communicating patterns lead hope growth future june 2022 rachel graduated codeups innis cohort honed skills data science shortly started working data science technical writer apex systems contractor usaa july 2022 unconventional role allowed understand skills could best utilized support usaa noncontract role rachel recently joined usaas data science delivery team decision science analyst february 2023 team focused delivering machine learning models fraud prevention rachels particular role centers around providing strategic process solutions team collaboration operational model risk components addition career rachel currently pursuing masters degree applied data science syracuse university expanding knowledge skills field rachel passionate collaborating individuals share belief potential others strive achieve growth logical informed action welcomes linkedin connections excited supporting network codeup alumni asked rachel codeup impacted career replied codeup delivered comprehensive education facets data science pipeline laying strong foundation build upon repeated handson practice developed reliable process immediately applicable job collaborative group projects instrumental helping hone skills project management allowing navigate complex data science projects comfortability thanks invaluable experience able make significant strides career within six months graduating codeup dont forget tune march 29th sit insightful conversation,women tech panelist spotlight rachel robbinsmayhil codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet rachel rachel robbinsmayhil decis scienc analyst san antonio texa rachel vari career includ counsel teach train commun develop militari oper focu alway assess need identifi solut educ individu group align need solut differ context rachel passion data scienc stem belief data power tool commun pattern lead hope growth futur june 2022 rachel graduat codeup inni cohort hone skill data scienc shortli start work data scienc technic writer apex system contractor usaa juli 2022 unconvent role allow understand skill could best util support usaa noncontract role rachel recent join usaa data scienc deliveri team decis scienc analyst februari 2023 team focus deliv machin learn model fraud prevent rachel particular role center around provid strateg process solut team collabor oper model risk compon addit career rachel current pursu master degre appli data scienc syracus univers expand knowledg skill field rachel passion collabor individu share belief potenti other strive achiev growth logic inform action welcom linkedin connect excit support network codeup alumni ask rachel codeup impact career repli codeup deliv comprehens educ facet data scienc pipelin lay strong foundat build upon repeat handson practic develop reliabl process immedi applic job collabor group project instrument help hone skill project manag allow navig complex data scienc project comfort thank invalu experi abl make signific stride career within six month graduat codeup dont forget tune march 29th sit insight convers
3,Women in Tech: Panelist Spotlight – Sarah Mellor,"Women in tech: Panelist Spotlight – Sarah Mellor Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Sarah! Sarah Mellor currently works as the Director of People Operations. She joined Codeup four and a half years ago as an Admissions Manager. She went on to build out and lead the Marketing and Admissions team, while picking up People Ops tasks and projects here and there until moving over to lead the People Ops team two years ago. Prior to Codeup, she worked at education-focused non-profits in Washington, DC and Boulder, Colorado. She graduated from Wake Forest University. We asked Sarah how Codeup has impacted her career, and her response was “I have absolutely loved having the privilege to grow alongside Codeup. In my time here across multiple different roles and departments, I’ve seen a lot of change. The consistent things have always been the high quality of passionate and hardworking people I get to work with; the impactful mission we get to work on; and the inspiring students who trust us with their career change.” Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight sarah mellor codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet sarah sarah mellor currently works director people operations joined codeup four half years ago admissions manager went build lead marketing admissions team picking people ops tasks projects moving lead people ops team two years ago prior codeup worked educationfocused nonprofits washington dc boulder colorado graduated wake forest university asked sarah codeup impacted career response absolutely loved privilege grow alongside codeup time across multiple different roles departments ive seen lot change consistent things always high quality passionate hardworking people get work impactful mission get work inspiring students trust us career change dont forget tune march 29th sit insightful conversation,women tech panelist spotlight sarah mellor codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet sarah sarah mellor current work director peopl oper join codeup four half year ago admiss manag went build lead market admiss team pick peopl op task project move lead peopl op team two year ago prior codeup work educationfocus nonprofit washington dc boulder colorado graduat wake forest univers ask sarah codeup impact career respons absolut love privileg grow alongsid codeup time across multipl differ role depart ive seen lot chang consist thing alway high qualiti passion hardwork peopl get work impact mission get work inspir student trust us career chang dont forget tune march 29th sit insight convers
4,Women in Tech: Panelist Spotlight – Madeleine Capper,"Women in tech: Panelist Spotlight – Madeleine Capper Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Madeleine! Madeleine Capper is a Data Scientist in San Antonio, Texas. A long-standing San Antonio resident, she studied mathematics at the University of Texas San Antonio and has worked as a Data Scientist for Booz Allen Hamilton. Madeleine currently teaches Data Science at Codeup, where she works daily with burgeoning data professionals to help them actualize their career aspirations through technical education. Madeleine attended Codeup as a student in early 2019 as a pupil in the very first Codeup Data Science cohort. The program proved immediately effective and she was the first student to obtain a data career out of the program. After working at Booz Allen Hamilton, Madeleine’s passion for education in conjunction with her appreciation for Codeup’s capacity for transformative life change brought her back to the institution in an instructional capacity, where she has been teaching for two years. Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight madeleine capper codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet madeleine madeleine capper data scientist san antonio texas longstanding san antonio resident studied mathematics university texas san antonio worked data scientist booz allen hamilton madeleine currently teaches data science codeup works daily burgeoning data professionals help actualize career aspirations technical education madeleine attended codeup student early 2019 pupil first codeup data science cohort program proved immediately effective first student obtain data career program working booz allen hamilton madeleines passion education conjunction appreciation codeups capacity transformative life change brought back institution instructional capacity teaching two years dont forget tune march 29th sit insightful conversation,women tech panelist spotlight madelein capper codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet madelein madelein capper data scientist san antonio texa longstand san antonio resid studi mathemat univers texa san antonio work data scientist booz allen hamilton madelein current teach data scienc codeup work daili burgeon data profession help actual career aspir technic educ madelein attend codeup student earli 2019 pupil first codeup data scienc cohort program prove immedi effect first student obtain data career program work booz allen hamilton madelein passion educ conjunct appreci codeup capac transform life chang brought back institut instruct capac teach two year dont forget tune march 29th sit insight convers
5,Black Excellence in Tech: Panelist Spotlight – Wilmarie De La Cruz Mejia,"Black excellence in tech: Panelist Spotlight – Wilmarie De La Cruz Mejia Codeup is hosting a Black Excellence in Tech Panel in honor of Black History Month on February 22, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as black leaders in the tech industry! Meet Wilmarie! Wilmarie De La Cruz Mejia is a current Codeup student on the path to becoming a Full-Stack Web Developer at our Dallas, TX campus. Wilmarie is a veteran expanding her knowledge of programming languages and technologies on her journey with Codeup. We asked Wilmarie to share more about her experience at Codeup. She shares, “I was able to meet other people who were passionate about coding and be in a positive learning environment.” We hope you can join us on February 22nd to sit in on an insightful conversation with Wilmarie and all of our panelists!",black excellence tech panelist spotlight wilmarie de la cruz mejia codeup hosting black excellence tech panel honor black history month february 22 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences black leaders tech industry meet wilmarie wilmarie de la cruz mejia current codeup student path becoming fullstack web developer dallas tx campus wilmarie veteran expanding knowledge programming languages technologies journey codeup asked wilmarie share experience codeup shares able meet people passionate coding positive learning environment hope join us february 22nd sit insightful conversation wilmarie panelists,black excel tech panelist spotlight wilmari de la cruz mejia codeup host black excel tech panel honor black histori month februari 22 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi black leader tech industri meet wilmari wilmari de la cruz mejia current codeup student path becom fullstack web develop dalla tx campu wilmari veteran expand knowledg program languag technolog journey codeup ask wilmari share experi codeup share abl meet peopl passion code posit learn environ hope join us februari 22nd sit insight convers wilmari panelist


In [34]:
lemmatize(codeup_df.clean[0])

'may traditionally known asian american pacific islander aapi heritage month month celebrate history contribution made possible aapi friend family community also examine level support seek opportunity better understand aapi community effort address real concern experience sat arbeena thapa one codeups financial aid enrollment manager arbeena identifies nepali american desi arbeenas parent immigrated texas 1988 better employment educational opportunity arbeenas older sister five made move u arbeena born later becoming first family u citizen codeup take effort inclusivity seriously speaking arbeena taught term aapi excludes desiamerican individual hence use term asian pacific islander desi american apida rest conversation arbeena went celebrate connect heritage cultural tradition celebrate nepal version christmas dashain nineday celebration also known dussehra grew hindu identify hindu large part heritage way connect culture include sharing food momos south asian dumpling theyre favorite

In [35]:
codeup_df['lemmatized'] = codeup_df.clean.apply(lemmatize)
codeup_df

Unnamed: 0,title,original,clean,stemmed,lemmatized
0,Spotlight on APIDA Voices: Celebrating Heritage and Inspiring Change ft. Arbeena Thapa,"May is traditionally known as Asian American and Pacific Islander (AAPI) Heritage Month. This month we celebrate the history and contributions made possible by our AAPI friends, family, and community. We also examine our level of support and seek opportunities to better understand the AAPI community. In an effort to address real concerns and experiences, we sat down with Arbeena Thapa, one of Codeup’s Financial Aid and Enrollment Managers. Arbeena identifies as Nepali American and Desi. Arbeena’s parents immigrated to Texas in 1988 for better employment and educational opportunities. Arbeena’s older sister was five when they made the move to the US. Arbeena was born later, becoming the first in her family to be a US citizen. At Codeup we take our efforts at inclusivity very seriously. After speaking with Arbeena, we were taught that the term AAPI excludes Desi-American individuals. Hence, we will now use the term Asian Pacific Islander Desi American (APIDA). Here is how the rest of our conversation with Arbeena went! How do you celebrate or connect with your heritage and cultural traditions? “I celebrate Nepal’s version of Christmas or Dashain. This is a nine-day celebration also known as Dussehra. I grew up as Hindu and I identify as Hindu, this is a very large part of my heritage. “ “Other ways I connect with my culture include sharing food! Momos are South Asian Dumplings and they’re my favorite to make and share.” “On my Asian American side, I am an advocate of immigrant justice and erasure within APIDA social or political movements. I participate in events to embrace my identity such as immigrant justice advocacy because I come from a mixed-status family. I’ve always been in a community with undocumented Asian immigrants. .” What are some of the challenges you have faced as an APIDA individual, personally or professionally? “I often struggle with being gendered as compliant or a pushover. Professionally, I am often stereotyped as meek, so I’ve been overlooked for leadership roles. We are seen as perpetually foreign; people tend to other us in that way, yet put us on a pedestal for what a model minority looks like. This has made me hesitant to share my heritage in the past because these assumptions get mapped onto me. ” Can you describe some common barriers of entry that APIDA individuals, specifically women may face when trying to enter or advance in the workplace? “Being overlooked for leadership. In the past, I have not been viewed as a leader. People sometimes have preconceived stereotypes of Asian women not being able to be bold, or being vocal can be mistaken for being too emotional. “ How do you believe microaggressions impact APIDA individuals in the workplace? Can you provide examples of such microaggressions? “Erasure is big. To me, only saying ‘Merry Christmas’ isn’t inclusive to other religions. People are often resistant to saying ‘Happy Holidays,’ but saying Merry Christmas excludes, and does not appreciate my heritage. “ “Often microaggressions are not micro at all. They typically are not aggressive racialized violence, but the term ‘micro’ minimizes impact.” “Some that I’ve heard are ‘What kind of Asian are you?’ or ‘Where are you from?’ This automatically makes me the ‘other’ and not seen as American. Even within the APIDA community, South Asians are overlooked as “Asian”.” How important is representation, specifically APIDA representation, in organizational leadership positions? “I want to say that it is important to have someone who looks like you in leadership roles, and it is, but those leaders may not share the same beliefs as you. Certain privileges such as wealth, resources, or lack of interaction with lower-socioeconomic-status Asian Americans may cause a difference in community politics. I do not think the bamboo ceiling is acceptable, but the company you work for plays a big part in your politics and belief alignment.” How do you feel about code-switching, and have you ever felt it necessary to code-switch? “I like sharing South Asian terms or connecting with others that have similar heritage and culture. A workplace that is welcoming to going into this sort of breakout is refreshing and makes space for us. However, having to code-switch could also mean a workplace that is not conducive and welcoming of other cultures. “ Finally, in your opinion, what long-term strategies can create lasting change in the workplace and ensure support, equality, and inclusion for APIDA individuals? “Prior to a career in financial aid, I did a lot of research related to the post-9/11 immigration of the South Asian diaspora. This background made me heavily rely on grassroots organizing. Hire the people that want to innovate, hire the changemakers, hire the button-pushers. Reduce reliance on whiteness as change. This will become natural for the organization and become organizational change. Change comes from us on the ground.” A huge thank you to Arbeena Thapa for sharing her experiences, and being vulnerable with us. Your words were inspiring and the opportunity to understand your perspective more has been valuable. We hope we can become better support for the APIDA community as we learn and grow on our journey of cultivating inclusive growth.",may traditionally known asian american pacific islander aapi heritage month month celebrate history contributions made possible aapi friends family community also examine level support seek opportunities better understand aapi community effort address real concerns experiences sat arbeena thapa one codeups financial aid enrollment managers arbeena identifies nepali american desi arbeenas parents immigrated texas 1988 better employment educational opportunities arbeenas older sister five made move us arbeena born later becoming first family us citizen codeup take efforts inclusivity seriously speaking arbeena taught term aapi excludes desiamerican individuals hence use term asian pacific islander desi american apida rest conversation arbeena went celebrate connect heritage cultural traditions celebrate nepals version christmas dashain nineday celebration also known dussehra grew hindu identify hindu large part heritage ways connect culture include sharing food momos south asian dumplings theyre favorite make share asian american side advocate immigrant justice erasure within apida social political movements participate events embrace identity immigrant justice advocacy come mixedstatus family ive always community undocumented asian immigrants challenges faced apida individual personally professionally often struggle gendered compliant pushover professionally often stereotyped meek ive overlooked leadership roles seen perpetually foreign people tend us way yet put us pedestal model minority looks like made hesitant share heritage past assumptions get mapped onto describe common barriers entry apida individuals specifically women may face trying enter advance workplace overlooked leadership past viewed leader people sometimes preconceived stereotypes asian women able bold vocal mistaken emotional believe microaggressions impact apida individuals workplace provide examples microaggressions erasure big saying merry christmas isnt inclusive religions people often resistant saying happy holidays saying merry christmas excludes appreciate heritage often microaggressions micro typically aggressive racialized violence term micro minimizes impact ive heard kind asian automatically makes seen american even within apida community south asians overlooked asian important representation specifically apida representation organizational leadership positions want say important someone looks like leadership roles leaders may share beliefs certain privileges wealth resources lack interaction lowersocioeconomicstatus asian americans may cause difference community politics think bamboo ceiling acceptable company work plays big part politics belief alignment feel codeswitching ever felt necessary codeswitch like sharing south asian terms connecting others similar heritage culture workplace welcoming going sort breakout refreshing makes space us however codeswitch could also mean workplace conducive welcoming cultures finally opinion longterm strategies create lasting change workplace ensure support equality inclusion apida individuals prior career financial aid lot research related post911 immigration south asian diaspora background made heavily rely grassroots organizing hire people want innovate hire changemakers hire buttonpushers reduce reliance whiteness change become natural organization become organizational change change comes us ground huge thank arbeena thapa sharing experiences vulnerable us words inspiring opportunity understand perspective valuable hope become better support apida community learn grow journey cultivating inclusive growth,may tradit known asian american pacif island aapi heritag month month celebr histori contribut made possibl aapi friend famili commun also examin level support seek opportun better understand aapi commun effort address real concern experi sat arbeena thapa one codeup financi aid enrol manag arbeena identifi nepali american desi arbeena parent immigr texa 1988 better employ educ opportun arbeena older sister five made move us arbeena born later becom first famili us citizen codeup take effort inclus serious speak arbeena taught term aapi exclud desiamerican individu henc use term asian pacif island desi american apida rest convers arbeena went celebr connect heritag cultur tradit celebr nepal version christma dashain nineday celebr also known dussehra grew hindu identifi hindu larg part heritag way connect cultur includ share food momo south asian dumpl theyr favorit make share asian american side advoc immigr justic erasur within apida social polit movement particip event embrac ident immigr justic advocaci come mixedstatu famili ive alway commun undocu asian immigr challeng face apida individu person profession often struggl gender compliant pushov profession often stereotyp meek ive overlook leadership role seen perpetu foreign peopl tend us way yet put us pedest model minor look like made hesit share heritag past assumpt get map onto describ common barrier entri apida individu specif women may face tri enter advanc workplac overlook leadership past view leader peopl sometim preconceiv stereotyp asian women abl bold vocal mistaken emot believ microaggress impact apida individu workplac provid exampl microaggress erasur big say merri christma isnt inclus religion peopl often resist say happi holiday say merri christma exclud appreci heritag often microaggress micro typic aggress racial violenc term micro minim impact ive heard kind asian automat make seen american even within apida commun south asian overlook asian import represent specif apida represent organiz leadership posit want say import someon look like leadership role leader may share belief certain privileg wealth resourc lack interact lowersocioeconomicstatu asian american may caus differ commun polit think bamboo ceil accept compani work play big part polit belief align feel codeswitch ever felt necessari codeswitch like share south asian term connect other similar heritag cultur workplac welcom go sort breakout refresh make space us howev codeswitch could also mean workplac conduc welcom cultur final opinion longterm strategi creat last chang workplac ensur support equal inclus apida individu prior career financi aid lot research relat post911 immigr south asian diaspora background made heavili reli grassroot organ hire peopl want innov hire changemak hire buttonpush reduc relianc white chang becom natur organ becom organiz chang chang come us ground huge thank arbeena thapa share experi vulner us word inspir opportun understand perspect valuabl hope becom better support apida commun learn grow journey cultiv inclus growth,may traditionally known asian american pacific islander aapi heritage month month celebrate history contribution made possible aapi friend family community also examine level support seek opportunity better understand aapi community effort address real concern experience sat arbeena thapa one codeups financial aid enrollment manager arbeena identifies nepali american desi arbeenas parent immigrated texas 1988 better employment educational opportunity arbeenas older sister five made move u arbeena born later becoming first family u citizen codeup take effort inclusivity seriously speaking arbeena taught term aapi excludes desiamerican individual hence use term asian pacific islander desi american apida rest conversation arbeena went celebrate connect heritage cultural tradition celebrate nepal version christmas dashain nineday celebration also known dussehra grew hindu identify hindu large part heritage way connect culture include sharing food momos south asian dumpling theyre favorite make share asian american side advocate immigrant justice erasure within apida social political movement participate event embrace identity immigrant justice advocacy come mixedstatus family ive always community undocumented asian immigrant challenge faced apida individual personally professionally often struggle gendered compliant pushover professionally often stereotyped meek ive overlooked leadership role seen perpetually foreign people tend u way yet put u pedestal model minority look like made hesitant share heritage past assumption get mapped onto describe common barrier entry apida individual specifically woman may face trying enter advance workplace overlooked leadership past viewed leader people sometimes preconceived stereotype asian woman able bold vocal mistaken emotional believe microaggressions impact apida individual workplace provide example microaggressions erasure big saying merry christmas isnt inclusive religion people often resistant saying happy holiday saying merry christmas excludes appreciate heritage often microaggressions micro typically aggressive racialized violence term micro minimizes impact ive heard kind asian automatically make seen american even within apida community south asian overlooked asian important representation specifically apida representation organizational leadership position want say important someone look like leadership role leader may share belief certain privilege wealth resource lack interaction lowersocioeconomicstatus asian american may cause difference community politics think bamboo ceiling acceptable company work play big part politics belief alignment feel codeswitching ever felt necessary codeswitch like sharing south asian term connecting others similar heritage culture workplace welcoming going sort breakout refreshing make space u however codeswitch could also mean workplace conducive welcoming culture finally opinion longterm strategy create lasting change workplace ensure support equality inclusion apida individual prior career financial aid lot research related post911 immigration south asian diaspora background made heavily rely grassroots organizing hire people want innovate hire changemakers hire buttonpushers reduce reliance whiteness change become natural organization become organizational change change come u ground huge thank arbeena thapa sharing experience vulnerable u word inspiring opportunity understand perspective valuable hope become better support apida community learn grow journey cultivating inclusive growth
1,Women in tech: Panelist Spotlight – Magdalena Rahn,"Women in tech: Panelist Spotlight – Magdalena Rahn Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Magdalena! Magdalena Rahn is a current Codeup student in a Data Science cohort in San Antonio, Texas. She has a professional background in cross-cultural communications, international business development, the wine industry and journalism. After serving in the US Navy, she decided to complement her professional skill set by attending the Data Science program at Codeup; she is set to graduate in March 2023. Magdalena is fluent in French, Bulgarian, Chinese-Mandarin, Spanish and Italian. We asked Magdalena how Codeup impacted her career, and she replied “Codeup has provided a solid foundation in analytical processes, programming and data science methods, and it’s been an encouragement to have such supportive instructors and wonderful classmates.” Don’t forget to tune in on March 29th to sit in on an insightful conversation with Magdalena.",women tech panelist spotlight magdalena rahn codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet magdalena magdalena rahn current codeup student data science cohort san antonio texas professional background crosscultural communications international business development wine industry journalism serving us navy decided complement professional skill set attending data science program codeup set graduate march 2023 magdalena fluent french bulgarian chinesemandarin spanish italian asked magdalena codeup impacted career replied codeup provided solid foundation analytical processes programming data science methods encouragement supportive instructors wonderful classmates dont forget tune march 29th sit insightful conversation magdalena,women tech panelist spotlight magdalena rahn codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet magdalena magdalena rahn current codeup student data scienc cohort san antonio texa profession background crosscultur commun intern busi develop wine industri journal serv us navi decid complement profession skill set attend data scienc program codeup set graduat march 2023 magdalena fluent french bulgarian chinesemandarin spanish italian ask magdalena codeup impact career repli codeup provid solid foundat analyt process program data scienc method encourag support instructor wonder classmat dont forget tune march 29th sit insight convers magdalena,woman tech panelist spotlight magdalena rahn codeup hosting woman tech panel honor woman history month march 29th 2023 celebrate wed like spotlight panelist leading discussion learn bit respective experience woman tech industry meet magdalena magdalena rahn current codeup student data science cohort san antonio texas professional background crosscultural communication international business development wine industry journalism serving u navy decided complement professional skill set attending data science program codeup set graduate march 2023 magdalena fluent french bulgarian chinesemandarin spanish italian asked magdalena codeup impacted career replied codeup provided solid foundation analytical process programming data science method encouragement supportive instructor wonderful classmate dont forget tune march 29th sit insightful conversation magdalena
2,Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill,"Women in tech: Panelist Spotlight – Rachel Robbins-Mayhill Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Rachel! Rachel Robbins-Mayhill is a Decision Science Analyst I in San Antonio, Texas. Rachel has had a varied career that includes counseling, teaching, training, community development, and military operations. Her focus has always been on assessing needs, identifying solutions, and educating individuals and groups on aligning needs and solutions in different contexts. Rachel’s passion for data science stems from her belief that data is a powerful tool for communicating patterns that can lead to hope and growth in the future. In June 2022, Rachel graduated from Codeup’s Innis cohort, where she honed her skills in data science. Shortly after, she started working as a Data Science Technical Writer with Apex Systems as a Contractor for USAA in July 2022. Her unconventional role allowed her to understand where her skills could be best utilized to support USAA in a non-contract role. Rachel recently joined USAA’s Data Science Delivery team as a Decision Science Analyst I in February 2023. The team is focused on delivering machine learning models for fraud prevention, and Rachel’s particular role centers around providing strategic process solutions for the team in collaboration with Operational and Model Risk components. In addition to her career, Rachel is currently pursuing a master’s degree in Applied Data Science from Syracuse University, further expanding her knowledge and skills in the field. Rachel is passionate about collaborating with individuals who share her belief in the potential of others and strive to achieve growth through logical, informed action. She welcomes LinkedIn connections and is excited about supporting the network of CodeUp alumni! We asked Rachel how Codeup impacted her career, and she replied “Codeup delivered a comprehensive education in all facets of the data science pipeline, laying a strong foundation for me to build upon. Through repeated hands-on practice, I developed a reliable process that was immediately applicable in my job. Collaborative group projects were instrumental in helping me hone my skills in project management, allowing me to navigate complex data science projects with comfortability. Thanks to this invaluable experience, I was able to make significant strides in my career within just six months of graduating from Codeup.” Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight rachel robbinsmayhill codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet rachel rachel robbinsmayhill decision science analyst san antonio texas rachel varied career includes counseling teaching training community development military operations focus always assessing needs identifying solutions educating individuals groups aligning needs solutions different contexts rachels passion data science stems belief data powerful tool communicating patterns lead hope growth future june 2022 rachel graduated codeups innis cohort honed skills data science shortly started working data science technical writer apex systems contractor usaa july 2022 unconventional role allowed understand skills could best utilized support usaa noncontract role rachel recently joined usaas data science delivery team decision science analyst february 2023 team focused delivering machine learning models fraud prevention rachels particular role centers around providing strategic process solutions team collaboration operational model risk components addition career rachel currently pursuing masters degree applied data science syracuse university expanding knowledge skills field rachel passionate collaborating individuals share belief potential others strive achieve growth logical informed action welcomes linkedin connections excited supporting network codeup alumni asked rachel codeup impacted career replied codeup delivered comprehensive education facets data science pipeline laying strong foundation build upon repeated handson practice developed reliable process immediately applicable job collaborative group projects instrumental helping hone skills project management allowing navigate complex data science projects comfortability thanks invaluable experience able make significant strides career within six months graduating codeup dont forget tune march 29th sit insightful conversation,women tech panelist spotlight rachel robbinsmayhil codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet rachel rachel robbinsmayhil decis scienc analyst san antonio texa rachel vari career includ counsel teach train commun develop militari oper focu alway assess need identifi solut educ individu group align need solut differ context rachel passion data scienc stem belief data power tool commun pattern lead hope growth futur june 2022 rachel graduat codeup inni cohort hone skill data scienc shortli start work data scienc technic writer apex system contractor usaa juli 2022 unconvent role allow understand skill could best util support usaa noncontract role rachel recent join usaa data scienc deliveri team decis scienc analyst februari 2023 team focus deliv machin learn model fraud prevent rachel particular role center around provid strateg process solut team collabor oper model risk compon addit career rachel current pursu master degre appli data scienc syracus univers expand knowledg skill field rachel passion collabor individu share belief potenti other strive achiev growth logic inform action welcom linkedin connect excit support network codeup alumni ask rachel codeup impact career repli codeup deliv comprehens educ facet data scienc pipelin lay strong foundat build upon repeat handson practic develop reliabl process immedi applic job collabor group project instrument help hone skill project manag allow navig complex data scienc project comfort thank invalu experi abl make signific stride career within six month graduat codeup dont forget tune march 29th sit insight convers,woman tech panelist spotlight rachel robbinsmayhill codeup hosting woman tech panel honor woman history month march 29th 2023 celebrate wed like spotlight panelist leading discussion learn bit respective experience woman tech industry meet rachel rachel robbinsmayhill decision science analyst san antonio texas rachel varied career includes counseling teaching training community development military operation focus always assessing need identifying solution educating individual group aligning need solution different context rachel passion data science stem belief data powerful tool communicating pattern lead hope growth future june 2022 rachel graduated codeups innis cohort honed skill data science shortly started working data science technical writer apex system contractor usaa july 2022 unconventional role allowed understand skill could best utilized support usaa noncontract role rachel recently joined usaas data science delivery team decision science analyst february 2023 team focused delivering machine learning model fraud prevention rachel particular role center around providing strategic process solution team collaboration operational model risk component addition career rachel currently pursuing master degree applied data science syracuse university expanding knowledge skill field rachel passionate collaborating individual share belief potential others strive achieve growth logical informed action welcome linkedin connection excited supporting network codeup alumnus asked rachel codeup impacted career replied codeup delivered comprehensive education facet data science pipeline laying strong foundation build upon repeated handson practice developed reliable process immediately applicable job collaborative group project instrumental helping hone skill project management allowing navigate complex data science project comfortability thanks invaluable experience able make significant stride career within six month graduating codeup dont forget tune march 29th sit insightful conversation
3,Women in Tech: Panelist Spotlight – Sarah Mellor,"Women in tech: Panelist Spotlight – Sarah Mellor Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Sarah! Sarah Mellor currently works as the Director of People Operations. She joined Codeup four and a half years ago as an Admissions Manager. She went on to build out and lead the Marketing and Admissions team, while picking up People Ops tasks and projects here and there until moving over to lead the People Ops team two years ago. Prior to Codeup, she worked at education-focused non-profits in Washington, DC and Boulder, Colorado. She graduated from Wake Forest University. We asked Sarah how Codeup has impacted her career, and her response was “I have absolutely loved having the privilege to grow alongside Codeup. In my time here across multiple different roles and departments, I’ve seen a lot of change. The consistent things have always been the high quality of passionate and hardworking people I get to work with; the impactful mission we get to work on; and the inspiring students who trust us with their career change.” Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight sarah mellor codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet sarah sarah mellor currently works director people operations joined codeup four half years ago admissions manager went build lead marketing admissions team picking people ops tasks projects moving lead people ops team two years ago prior codeup worked educationfocused nonprofits washington dc boulder colorado graduated wake forest university asked sarah codeup impacted career response absolutely loved privilege grow alongside codeup time across multiple different roles departments ive seen lot change consistent things always high quality passionate hardworking people get work impactful mission get work inspiring students trust us career change dont forget tune march 29th sit insightful conversation,women tech panelist spotlight sarah mellor codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet sarah sarah mellor current work director peopl oper join codeup four half year ago admiss manag went build lead market admiss team pick peopl op task project move lead peopl op team two year ago prior codeup work educationfocus nonprofit washington dc boulder colorado graduat wake forest univers ask sarah codeup impact career respons absolut love privileg grow alongsid codeup time across multipl differ role depart ive seen lot chang consist thing alway high qualiti passion hardwork peopl get work impact mission get work inspir student trust us career chang dont forget tune march 29th sit insight convers,woman tech panelist spotlight sarah mellor codeup hosting woman tech panel honor woman history month march 29th 2023 celebrate wed like spotlight panelist leading discussion learn bit respective experience woman tech industry meet sarah sarah mellor currently work director people operation joined codeup four half year ago admission manager went build lead marketing admission team picking people ops task project moving lead people ops team two year ago prior codeup worked educationfocused nonprofit washington dc boulder colorado graduated wake forest university asked sarah codeup impacted career response absolutely loved privilege grow alongside codeup time across multiple different role department ive seen lot change consistent thing always high quality passionate hardworking people get work impactful mission get work inspiring student trust u career change dont forget tune march 29th sit insightful conversation
4,Women in Tech: Panelist Spotlight – Madeleine Capper,"Women in tech: Panelist Spotlight – Madeleine Capper Codeup is hosting a Women in Tech Panel in honor of Women’s History Month on March 29th, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as women in the tech industry! Meet Madeleine! Madeleine Capper is a Data Scientist in San Antonio, Texas. A long-standing San Antonio resident, she studied mathematics at the University of Texas San Antonio and has worked as a Data Scientist for Booz Allen Hamilton. Madeleine currently teaches Data Science at Codeup, where she works daily with burgeoning data professionals to help them actualize their career aspirations through technical education. Madeleine attended Codeup as a student in early 2019 as a pupil in the very first Codeup Data Science cohort. The program proved immediately effective and she was the first student to obtain a data career out of the program. After working at Booz Allen Hamilton, Madeleine’s passion for education in conjunction with her appreciation for Codeup’s capacity for transformative life change brought her back to the institution in an instructional capacity, where she has been teaching for two years. Don’t forget to tune in on March 29th to sit in on an insightful conversation.",women tech panelist spotlight madeleine capper codeup hosting women tech panel honor womens history month march 29th 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences women tech industry meet madeleine madeleine capper data scientist san antonio texas longstanding san antonio resident studied mathematics university texas san antonio worked data scientist booz allen hamilton madeleine currently teaches data science codeup works daily burgeoning data professionals help actualize career aspirations technical education madeleine attended codeup student early 2019 pupil first codeup data science cohort program proved immediately effective first student obtain data career program working booz allen hamilton madeleines passion education conjunction appreciation codeups capacity transformative life change brought back institution instructional capacity teaching two years dont forget tune march 29th sit insightful conversation,women tech panelist spotlight madelein capper codeup host women tech panel honor women histori month march 29th 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi women tech industri meet madelein madelein capper data scientist san antonio texa longstand san antonio resid studi mathemat univers texa san antonio work data scientist booz allen hamilton madelein current teach data scienc codeup work daili burgeon data profession help actual career aspir technic educ madelein attend codeup student earli 2019 pupil first codeup data scienc cohort program prove immedi effect first student obtain data career program work booz allen hamilton madelein passion educ conjunct appreci codeup capac transform life chang brought back institut instruct capac teach two year dont forget tune march 29th sit insight convers,woman tech panelist spotlight madeleine capper codeup hosting woman tech panel honor woman history month march 29th 2023 celebrate wed like spotlight panelist leading discussion learn bit respective experience woman tech industry meet madeleine madeleine capper data scientist san antonio texas longstanding san antonio resident studied mathematics university texas san antonio worked data scientist booz allen hamilton madeleine currently teach data science codeup work daily burgeoning data professional help actualize career aspiration technical education madeleine attended codeup student early 2019 pupil first codeup data science cohort program proved immediately effective first student obtain data career program working booz allen hamilton madeleines passion education conjunction appreciation codeups capacity transformative life change brought back institution instructional capacity teaching two year dont forget tune march 29th sit insightful conversation
5,Black Excellence in Tech: Panelist Spotlight – Wilmarie De La Cruz Mejia,"Black excellence in tech: Panelist Spotlight – Wilmarie De La Cruz Mejia Codeup is hosting a Black Excellence in Tech Panel in honor of Black History Month on February 22, 2023! To further celebrate, we’d like to spotlight each of our panelists leading up to the discussion to learn a bit about their respective experiences as black leaders in the tech industry! Meet Wilmarie! Wilmarie De La Cruz Mejia is a current Codeup student on the path to becoming a Full-Stack Web Developer at our Dallas, TX campus. Wilmarie is a veteran expanding her knowledge of programming languages and technologies on her journey with Codeup. We asked Wilmarie to share more about her experience at Codeup. She shares, “I was able to meet other people who were passionate about coding and be in a positive learning environment.” We hope you can join us on February 22nd to sit in on an insightful conversation with Wilmarie and all of our panelists!",black excellence tech panelist spotlight wilmarie de la cruz mejia codeup hosting black excellence tech panel honor black history month february 22 2023 celebrate wed like spotlight panelists leading discussion learn bit respective experiences black leaders tech industry meet wilmarie wilmarie de la cruz mejia current codeup student path becoming fullstack web developer dallas tx campus wilmarie veteran expanding knowledge programming languages technologies journey codeup asked wilmarie share experience codeup shares able meet people passionate coding positive learning environment hope join us february 22nd sit insightful conversation wilmarie panelists,black excel tech panelist spotlight wilmari de la cruz mejia codeup host black excel tech panel honor black histori month februari 22 2023 celebr wed like spotlight panelist lead discuss learn bit respect experi black leader tech industri meet wilmari wilmari de la cruz mejia current codeup student path becom fullstack web develop dalla tx campu wilmari veteran expand knowledg program languag technolog journey codeup ask wilmari share experi codeup share abl meet peopl passion code posit learn environ hope join us februari 22nd sit insight convers wilmari panelist,black excellence tech panelist spotlight wilmarie de la cruz mejia codeup hosting black excellence tech panel honor black history month february 22 2023 celebrate wed like spotlight panelist leading discussion learn bit respective experience black leader tech industry meet wilmarie wilmarie de la cruz mejia current codeup student path becoming fullstack web developer dallas tx campus wilmarie veteran expanding knowledge programming language technology journey codeup asked wilmarie share experience codeup share able meet people passionate coding positive learning environment hope join u february 22nd sit insightful conversation wilmarie panelist


In [37]:
def clean_df(df, extra_words=[], exclude_words=[]):
    """
    
    
    """
    

## 9. Ask yourself:
 
. If your corpus is 493KB, would you prefer to use stemmed or lemmatized text?    
. If your corpus is 25MB, would you prefer to use stemmed or lemmatized text?    
. If your corpus is 200TB of text and you're charged by the megabyte for your hosted computational resources, would you prefer to use stemmed or lemmatized text?    

. If your corpus is 493KB, would you prefer to use stemmed or lemmatized text?
- I would prefer to use lemmatized

. If your corpus is 25MB, would you prefer to use stemmed or lemmatized text?
- I would prefer to use lemmatized

. If your corpus is 200TB of text and you're charged by the megabyte for your hosted computational resources, would you prefer to use stemmed or lemmatized text?
- In this case i will use stemmed as it is faster and the gain is not much bigger.

## EXTRA

In [36]:
def basic_clean(article):
    """
    
    """
    # lowercase text
    article = article.lower()
    
    # remove any accented characters and non-ASCII characters
    # normalizing
    # getting ride of anything not in ascii
    # turning back to a string
    article = unicodedata.normalize('NFKD', article).encode('ascii','ignore').decode('utf-8')
    
    # remove special characters
    #use re.sub to remove special characters
    article = re.sub(r'[^a-z0-9\'\s]', '', article)
    
    # tokenization is the process of breaking something down into smaller, discrete units.
    # these units are called tokens.
    #create the tokenizer
    tokenize = nltk.tokenize.ToktokTokenizer()
    article = tokenize.tokenize(article, return_str=True)
    
    # Lemmatize
    # - **changes** words to their "root"
    # - it can conjugate to the base word 
    # - example: "mouse", "mice" --> "mouse"
    # - slower than stemming
    #create the lemmatizer
    wnl = nltk.stem.WordNetLemmatizer()
    
    #use lemmatize - apply stem to each word in our string
    # wnl.lemmatize(article)
    lemma = [wnl.lemmatize(word) for word in article.split()]
    
    #join words back together
    article_lemma = ' '.join(lemma)
    
    #save stopwords
    stopwords_ls = stopwords.words('english')
    
    # sort words inside stopwords
    stopwords_ls.sort()
    
    # #set a list to remove some stopwords IF THEY ARE NEEDED!
    # extra = ['all', 'about','after']
    # # remove extra words
    # set(stopwords_ls) - set(extra)

    #add to stopword list
    stopwords_ls.append("'")
    
    # #remove from stopword list
    # stopwords_ls.remove('o')
    
    #split words in lemmatized article
    words = article_lemma.split()
        
    #remove stopwords from list of words
    filtered = [word for word in words if word not in stopwords_ls]
    
    #join words back together
    parsed_article = ' '.join(filtered)
    
    return parsed_article