# Coercing Strings Lab

### Introduction

In this lab, we'll practice working with string data in Python, using the Yelp reviews for Max's Wine Bar in Texas.

### Loading The Data

We start by loading up the data, and then selecting the text from the reviews.

In [1]:
import pandas as pd
url = "https://raw.githubusercontent.com/eng-6-22/mod-1-a-data-structures/master/5-coercing-strings/updated-reviews.json"
df = pd.read_json(url)

In [2]:
reviews = list(df.text)

In [3]:
reviews[:2]

['Chicken and waffles are bombomski!!!! So yum 100% recommend. Outdoor area is good! Will be back.',
 "This place holds great memories for me and my family.... \xa0It was one of the last outtings with my mother.... Now it's closed down! The food was really good and it was a great ambience with the diverse crowd that was out. \xa0We had a table for about 20 or so for our guest. Our waitress did not miss a beat! \xa0She was able to get everyone's order and come back to make sure we were taken care of. \xa0Top Notch! \xa0Max hit the spot of providing a little of something for everyone from the menu. \xa0The cocktails were also really good with a nice kick! The comfort food hit the spot for me.... \xa0I was really loving the greens and fried chicken. Now when I added the waffles to the dish it makes the perfect combination. They had no problem with splitting the bill with such a large crowd. Everyone had a nice time and enjoyed the occasion as well as food...."]

### Finding and Counting

Let's begin by practicing with Python's `find` method. First, we'll select the seventh review.

In [4]:
seventh_review = reviews[6]
seventh_review[:30]

'The friend chicken here is del'

Now imagine that we want to write a function that selects the characters before and after the word `chicken`, to see what it is often associated with.

> Begin by finding the index of the string in the `seventh_review` where the word `'chicken'` begins.

In [5]:
char_chicken = seventh_review.find('chicken')
# 11
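As a quick reference for how `find` behaves, the sketch below uses a short made-up sentence (`sample` is a hypothetical string, not one of the reviews). `str.find` returns the index of the first match, accepts an optional start position, and returns `-1` when the substring is absent.

```python
# `sample` is an illustrative string, not taken from the Yelp data.
sample = "fried chicken and more chicken"

# Index of the first occurrence.
first_idx = sample.find("chicken")

# Search again, starting just after the first match.
second_idx = sample.find("chicken", first_idx + 1)

# -1 signals "not found" (find does not raise an error).
missing_idx = sample.find("waffles")

print(first_idx, second_idx, missing_idx)  # 6 23 -1
```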

Next, write a function that selects the ten characters before and the ten characters after the word `chicken`.

In [6]:
def select_text(review):
    idx = review.find('chicken')
    return review[idx-10:idx+10+7]

In [7]:
select_text(seventh_review)
# 'he friend chicken here is d'

'he friend chicken here is d'

Now write a function that takes any string and any word as arguments, and returns that word along with the ten characters before and after it.

In [8]:
def select_text_around(review, word):
    idx = review.find(word)
    return review[idx-10:idx+10+len(word)]

In [9]:
select_text_around(seventh_review, 'fried')
# 'e classic fried chicken, '

'e classic fried chicken, '
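Both functions above assume the word is present and sits at least ten characters into the review. The variant below is one possible way to handle those edge cases; the name `select_text_safe`, the `width` parameter, and the `sample` string are hypothetical additions, not part of the lab.

```python
def select_text_safe(review, word, width=10):
    """Return `word` plus up to `width` characters on each side, or '' if absent."""
    idx = review.find(word)
    if idx == -1:
        # find returns -1 when the word is missing; slicing with -1 would
        # silently return unrelated text, so bail out early instead.
        return ""
    # Clamp the start at 0: a negative start would count from the end of the string.
    start = max(idx - width, 0)
    # A slice that runs past the end of the string is truncated automatically.
    return review[start:idx + len(word) + width]


# `sample` is a made-up string used only for illustration.
sample = "The fried chicken here is delicious"
print(select_text_safe(sample, "chicken"))  # 'The fried chicken here is d'
print(select_text_safe(sample, "The"))      # 'The fried chi'
print(select_text_safe(sample, "waffle"))   # ''
```

Returning an empty string for a missing word is a design choice; returning `None` or raising an error would be equally reasonable.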

### Using Count

Next, let's write a function that counts up the number of exclamation points in a review.

> Do so with the `count` method.

In [10]:
def count_exclamations(review):
    return review.count('!')

In [11]:
first_review = reviews[0]
count_exclamations(first_review)
# 5

5

There are other ways to do this.  For example, note that we can iterate through the characters of a string.

> See this by uncommenting and executing the cell below.

In [12]:
# [char for char in first_review]
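For a sense of what that cell produces, here is the same comprehension on a short made-up string (`snippet` is a hypothetical example). Each character, including spaces and punctuation, becomes its own list element, and an `if` clause can filter them.

```python
# `snippet` is an illustrative string, not one of the reviews.
snippet = "So yum!"

# Iterating over a string yields one character at a time.
chars = [char for char in snippet]
print(chars)  # ['S', 'o', ' ', 'y', 'u', 'm', '!']

# An `if` clause keeps only the characters that pass a test, here the letters.
letters = [char for char in snippet if char.isalpha()]
print(len(letters))  # 5
```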

Now use a list comprehension to count the number of exclamation points in a string.

In [13]:
def count_exclamations(review):
    return sum([1 for c in review if c == '!'])

In [14]:
def count_exclamations(review):
    # Iterate over the characters of `review` (the argument), not the global `reviews` list.
    return len([c for c in review if c == '!'])

In [15]:
count_exclamations(first_review)

5

### Back to Split

Let's return to our function that found the text surrounding the word `chicken` using the `find` method. Another way to do this is to split the review into words, and then select the words before and after the specified word.

> To do this, we'll need a way to find the `index` where our word is located, so look at:

* `enumerate` in Python
* using `range()` to iterate through a list
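Both approaches in the list above give access to a position alongside each word. A minimal sketch on a made-up sentence (`words` is hypothetical data, not a real review):

```python
# A made-up sentence, split into a list of words.
words = "the chicken was good and the waffle was cold".split(' ')

# enumerate yields (index, item) pairs, so the position comes for free.
with_enumerate = [i for i, word in enumerate(words) if word == 'was']

# range(len(...)) yields only the positions; each word is looked up by index.
with_range = [i for i in range(len(words)) if words[i] == 'was']

print(with_enumerate, with_range)  # [2, 7] [2, 7]
```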

In [16]:
def review_word(review, selected_word):
    results = []
    split_review = review.split(' ')

    for i, word in enumerate(split_review):
        if word == selected_word:
            # Take the word before, the match, and the word after; max() stops
            # a match at index 0 from wrapping around to the end of the list.
            results.append(' '.join(split_review[max(i - 1, 0):i + 2]))
    return results

In [17]:
chosen_review = reviews[13]
chosen_review

"Service was slow even though the place wasn't crowded. I ordered the chicken and waffles. The waffle was cold with no butter the chicken was good but I had to ask for syrup. I think the dish was over priced as well 18$ for a waffle and three chicken wings. Luckily I got a Groupon deal on the meal the total for our table would have been 65$ without the deal. The only way I would eat here again is through Groupon def not worth full price."

In [18]:
review_word(chosen_review, "was")
# ['Service was slow', 'waffle was cold', 'chicken was good', 'dish was over']

['Service was slow', 'waffle was cold', 'chicken was good', 'dish was over']

> This exercise is challenging, so break it into smaller steps and work through them one at a time.
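One possible way to break the problem down, shown on a made-up sentence rather than a real review: split the text, locate the word, then slice out its neighbors and rejoin them. The variable names here are illustrative only.

```python
sentence = "the chicken was good"  # hypothetical example text

# Step 1: split the sentence into a list of words.
words = sentence.split(' ')

# Step 2: find the position of the target word.
position = words.index('was')

# Step 3: slice one word either side of it and glue the pieces back together.
neighbors = words[position - 1:position + 2]
phrase = ' '.join(neighbors)

print(phrase)  # chicken was good
```

Note that `list.index` returns only the first match and raises `ValueError` when the word is absent, which is why a full solution loops over every position instead.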