<img src="https://www.sturgischarterschool.com/wp-content/uploads/2019/06/sturgisheader_logo.png" alt="sturgis" width="250" align="right"/>

## Computer Science Demo Notebook 1
### Sturgis Charter Public School 

#### Narrative

Demo notebooks are not assignments. Rather, they are an opportunity for me to lecture or pre-record a demonstration of particular computer science code or principles. You are not expected to learn everything shown in a demo on the first pass. Sometimes I will mix in more advanced concepts so that you can see the direction we're headed. 

#### The Scenario

Imagine that we are working for a company that receives customer feedback forms, and our boss has directed us to sort these forms into various groupings. Additionally, this company has three different stores, and they want to see whether the trends differ from store to store. Your job is to write a function which takes these reviews and makes sense of them. 

There are three documents, one for each store, and in each document a review occupies a single line (that's conveniently simple). Each review is a narrative sentence followed by a series of word-number pairs. It will look something like this:

> "This store is awesome I love that I can buy anything I want. Great staff." staff: 5/5, cleanliness 3/5, selection: 4/5

Now the first thing to note is that simple data types like int and str are not going to get us where we want on their own. The first thing to think about is how we want to break this down. 

In [None]:
#The first thing we're going to need to do is open each store's file and read its contents into a variable
file = open("store1.txt", "r")
s1 = file.read()
file.close()

file = open("store2.txt", "r")
s2 = file.read()
file.close()

file = open("store3.txt", "r")
s3 = file.read()
file.close()

In [None]:
#We can check to see if this did what we wanted by printing out the variables s1, s2, s3
print(s1, '\n', s2, '\n', s3)

#### Reformatting data

One of the key things to remember is that a variable is happy to change at any point. We can also create new variables. Right now I have three strings, but that's not very useful to me. I can't do much with that. Let's try and split these into meaningful chunks. 

One thing to keep in mind: I'm going to teach you the simplest way to do this, which in reality means a more difficult way. We'll talk about imports at a future point. 

So let's first ask ourselves this question: How do we know when we want to break up the string? Let's take a look at just one of the stores. 

In [None]:
print(repr(s1))
#pay close attention to \n. What do you think that's doing there?
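
As a small illustration of what `repr` reveals, here is a sketch using a short made-up string (`sample` is hypothetical and is not the contents of `store1.txt`). Printing the string normally turns each newline into a line break, while `repr` shows it as the two visible characters `\n`.

```python
# 'sample' is a made-up two-line string, not real data from store1.txt
sample = "first review\nsecond review\n"

print(sample)        # the newline characters show up as actual line breaks
print(repr(sample))  # repr shows each newline as the visible characters \n

# Counting the newlines tells us how many lines end in this string
newline_count = sample.count('\n')
print(newline_count)
```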

In [None]:
#Let's see if we can break this up. 
reviews = [] #We'll make an empty list. lists are great for holding a series of data that we can then manipulate.
rev = ''
for character in s1:
    if character == '\n':
        reviews.append(rev)
        rev = ''
    else:
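        rev = rev + character
if rev != '': #if the text doesn't end with '\n', the last review is still sitting in rev
    reviews.append(rev)

#Now let's make sure we got everything
print(len(reviews)) #this is asking me to print the length of my list 'reviews'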
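
For comparison, here is a sketch of the same split done two ways on a made-up string (`text` is a hypothetical stand-in for `s1`): once character by character, and once with Python's built-in `str.splitlines()`. The hand-written loop includes a final check so that a last line without a trailing `\n` is not lost.

```python
# 'text' is a made-up stand-in for s1, not real data from store1.txt
text = "Great store. staff: 5/5\nToo crowded. staff: 2/5\n"

# Character-by-character version, same idea as the loop above
by_hand = []
current = ''
for character in text:
    if character == '\n':
        by_hand.append(current)
        current = ''
    else:
        current = current + character
if current != '':  # keep a final line that has no trailing newline
    by_hand.append(current)

# Built-in shortcut: split the string at every line break
built_in = text.splitlines()

print(by_hand)
print(by_hand == built_in)
```

Both approaches give the same list here. The loop is worth writing once by hand because it shows what the shortcut is doing for us.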

In [None]:
#And we could call each str from the list one at a time. Let's number them for fun. 
for i, line in enumerate(reviews):
    print(i, line)

#### Slices

Now, can you imagine what I want to do next? Let's split the text review from the score reviews. 

Let's do something a little different here. Instead of using the quotes, let's use the fact that the second half of every line has the same form, and therefore the same length:

>  staff: 5/5, cleanliness 3/5, selection: 4/5

Let's see how many characters that is. 

`print(len(" staff: 5/5, cleanliness 3/5, selection: 4/5"))`

`44`

In [None]:
for line in reviews:
    print(line[:-44])
    print(line[-44:])
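
To see what the two slices do on their own, here is a sketch that applies them to the sample review from the scenario. A negative index counts from the end of the string, so `[:-44]` means "everything except the last 44 characters" and `[-44:]` means "only the last 44 characters". Note that this trick assumes every score is a single digit out of 5; a score such as 10/10 would change the length and break the slice.

```python
# The sample review from the scenario
line = '"This store is awesome I love that I can buy anything I want. Great staff." staff: 5/5, cleanliness 3/5, selection: 4/5'

text_part = line[:-44]   # everything except the last 44 characters
score_part = line[-44:]  # only the last 44 characters

print(text_part)
print(score_part)
print(len(score_part))
```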

In [None]:
#But what I really want is to build a new data structure that has both of these pieces, that I could manipulate. 
#Let's run through it one more time. 
refined = []
for line in reviews:
    refined.append((line[:-44], line[-44:])) #note the extra set of parentheses: the inner pair makes a tuple

In [None]:
#Now I could access just one or the other. 
for line in refined:
    print(line[1])
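
Here is a sketch of how the pieces of each tuple can be reached, using a made-up list in the same shape as `refined` (`refined_sample` and its reviews are hypothetical). Index `0` is the written review and index `1` is the score text. A `for` loop can also unpack the tuple into two named variables.

```python
# 'refined_sample' is a made-up list in the same shape as 'refined'
refined_sample = [
    ('"Great staff."', ' staff: 5/5, cleanliness 3/5, selection: 4/5'),
    ('"Too messy."', ' staff: 4/5, cleanliness 1/5, selection: 3/5'),
]

# First index picks the review, second index picks the half of the tuple
first_text = refined_sample[0][0]
first_scores = refined_sample[0][1]
print(first_text)
print(first_scores)

# Unpacking gives each half of the tuple its own name
for text, score_text in refined_sample:
    print(text, '->', score_text)
```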

In [None]:
# Now, let's see if we can find out how this store did overall. 
# To do this, I'm going to use a dictionary. This dictionary is going to have three keys, and each key must have a value.
# First we'll make our dictionary.
dictscores = {'staff': 0, 'cleanliness': 0, 'selection': 0}
print(dictscores)
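
As a quick sketch of how values in a dictionary change, here is a made-up running total in the same shape as `dictscores` (`totals` and the numbers added to it are hypothetical). Each key is looked up by name, and `+=` adds to whatever value is already stored there.

```python
# A made-up running total, in the same shape as dictscores
totals = {'staff': 0, 'cleanliness': 0, 'selection': 0}

totals['staff'] += 5        # add a score to one key
totals['cleanliness'] += 3  # a different key has its own separate value
totals['staff'] += 4        # adding to the same key again accumulates

print(totals['staff'])
print(totals)
```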

In [None]:
#Now it's just a matter of collecting the numbers

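#Let's work on the fact that we know each line is going to have only 6 numbers, and the 1st, 3rd and 5th are the ones we want.
#We include '0' so that a score of 0/5 would not be skipped and throw off the pattern.
matches = ['0', '1', '2', '3', '4', '5']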
scores = []
for line in refined:
    for character in line[1]:
        if character in matches:
            scores.append(character)
print(scores)

In [None]:
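#Now it's time to extract the numbers we want. 
# Let's use the modulus to keep the numbers at even positions (0, 2, 4, ...).
# Those are the 1st, 3rd and 5th numbers of each line. The others are the "out of 5" parts.
rscore = []
for i, score in enumerate(scores):
    if i % 2 == 0:
        print(i, score)
        rscore.append(score)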
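
The same "every other item" selection can also be written as a slice with a step. This is a sketch on made-up digits (`digits` is hypothetical, laid out as score, 5, score, 5, and so on), showing that the modulus loop and the slice `[::2]` keep the same items.

```python
# Made-up digits for two reviews: score, 5, score, 5, ...
digits = ['5', '5', '3', '5', '4', '5', '2', '5', '1', '5', '3', '5']

# Modulus version: keep positions 0, 2, 4, ...
kept = []
for i, digit in enumerate(digits):
    if i % 2 == 0:
        kept.append(digit)

# Slice version: start at the beginning, go to the end, step by 2
stepped = digits[::2]

print(kept)
print(stepped)
```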

In [None]:
#Finally we can simply cycle through the three and add them to our dictionary. 
dictscores = {'staff': 0, 'cleanliness': 0, 'selection': 0}

key = 'staff'
for score in rscore:
    if key == 'staff':
        dictscores[key] += int(score)
        key = 'cleanliness'
    elif key == 'cleanliness':
        dictscores[key] += int(score)
        key = 'selection'
    elif key == 'selection':
        dictscores[key] += int(score)
        key = 'staff'
print(dictscores)
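
One possible way to shorten the cycling above is to keep the three keys in a list and use the modulus to pick the right one for each position. This is only a sketch on made-up scores (`keys`, `sample_scores` and `totals` are hypothetical names, not part of the cells above).

```python
# The three categories, in the order they appear on each line
keys = ['staff', 'cleanliness', 'selection']

# Made-up scores for two reviews, in the same shape as rscore
sample_scores = ['5', '3', '4', '2', '1', '3']

totals = {'staff': 0, 'cleanliness': 0, 'selection': 0}
for i, score in enumerate(sample_scores):
    key = keys[i % 3]          # positions 0, 1, 2, 0, 1, 2, ...
    totals[key] += int(score)  # convert the character to a number before adding

print(totals)
```

The `if`/`elif` version spells out every step, which makes it easier to follow the first time through. The modulus version does the same bookkeeping in fewer lines.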

In [None]:
#And we'll stop there for the time being. Thanks for participating in this demo!
#Now if you want to practice, try doing the comparison: which of the three stores has the best overall score?