## Introduction

![](https://download.logo.wine/logo/ASOS_(retailer)/ASOS_(retailer)-Logo.wine.png)

It's your first day on the job and you're given data about the weekly revenue at ASOS.

In the list named `revenue_by_week`, you'll find a snippet of ASOS's estimated weekly revenue, captured in millions of pounds.

But, ugh! There seem to be some bugs with how the data was stored in the list 😭

In [None]:
revenue_by_week = [65, 77, '66', '74', 
                   64, 82, '86', 72, '80',
                   96, 101, '35', '72', '68',
                  ]

sum(revenue_by_week)

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Use the skills you learned about lists, functions, and loops to get this data sorted out.

## Task 1: Fixing The Data

![](https://media.giphy.com/media/QsnQsvAkrkKGiHZuo8/giphy.gif)

Take a closer look at that error from the cell above, coming from running the `sum()` function. 

It's a `TypeError`. Python is basically saying that it doesn't know how to add (`+`) a number (`int`) and a string (`str`) together.

So now take a closer look at the `revenue_by_week` list. You can see that some of the numbers are there as numbers (without the surrounding `''` quotes), and some of them are strings (with surrounding `''` quotes).
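If you'd rather not count the quotes by eye, a short loop can do the tally. This is a minimal sketch that repeats the list so it runs on its own; the built-in `isinstance()` and the counter names `int_count` and `str_count` are not part of the task, they are only used here for illustration.

```python
# Same values as the revenue_by_week list defined earlier in this notebook
revenue_by_week = [65, 77, '66', '74',
                   64, 82, '86', 72, '80',
                   96, 101, '35', '72', '68']

# Tally how many entries are stored as strings and how many as numbers
int_count = 0
str_count = 0
for value in revenue_by_week:
    if isinstance(value, str):
        str_count += 1
    else:
        int_count += 1

print(f"numbers: {int_count}, strings: {str_count}")
```

Half of the entries turn out to be strings, which is why `sum()` cannot get through the list.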

You could go into the list and remove all of the `'` marks by hand, but that sounds like a pain, and it would be far worse with more data than these three months of ASOS revenue. Instead, use your programming skills so that Python does the manual work for you! There's a built-in function, `int()`, that converts the argument given to it into an integer. **Run the cell below** to see a demonstration on various data types.

**Note**: The last line in the cell will give an error, since a `list` is not a number (the individual elements can be, though!).

In [None]:
print(int(105))   # integer
print(int(52.80)) # decimal (float)
print(int('97'))  # integer string
print(int([105, '97'])) # list

105
52
97


TypeError: ignored

Notice that the `int()` function accepts values that are already numbers (chopping off any decimal part if present) and strings that represent integers (it will fail on decimal strings such as `'52.80'`, however). Notice also that Python threw an error when we tried to give it a list.
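The decimal-string case is not in the ASOS data, but it is easy to trip over. The sketch below shows the failure and one possible workaround (converting to `float` first); the `try`/`except` is only there so the cell keeps running after the error, and the name `converted` is purely illustrative.

```python
# int() rejects a string that contains a decimal point
try:
    int('52.80')
except ValueError as error:
    print("ValueError:", error)

# One possible workaround: convert to float first, then to int
converted = int(float('52.80'))
print(converted)  # 52
```

As with `int(52.80)`, the decimal part is chopped off rather than rounded.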

In order to clean the data, we need to do that item by item. And how do we go item by item? Yes! We use a loop!

![](https://media.giphy.com/media/ryQDjtPSPfgeA/giphy.gif)

To complete this task, create a new list `cleaned_revenue` that has all of the data values in a numeric data type:
- Set up `cleaned_revenue` as an empty list.
- Use a `for` loop to loop over the elements of `revenue_by_week`.
  - For each element, convert it to an integer data type with the `int()` function.
  - Append the converted value to the `cleaned_revenue` list.

Outside of your loop, `print` the completed `cleaned_revenue` and `print` the total sum of values.

In [None]:
# set up storage for cleaned data
cleaned_revenue = []


# loop through data and convert to integers
for revenue in revenue_by_week:
    cleaned_revenue.append(int(revenue))


# assess the cleaned data by printing it
print(cleaned_revenue)
print(sum(cleaned_revenue))

[65, 77, 66, 74, 64, 82, 86, 72, 80, 96, 101, 35, 72, 68]
1038


## Task 2: Slicing and Dicing

![](https://media.giphy.com/media/vsuBq0HLMUw6nTqLWS/giphy.gif)

Your friend is so pleased with your work that they want you to do some additional calculations. 

- **What was the total amount made in each month?**
- **Which month had the highest average (weekly) revenue?**

To answer these questions, you should know how the weeks map to months: the first four weeks were June (1st-4th entries in the list), the middle five weeks were July (5th-9th entries), and the last five weeks were August (10th-14th entries).

Use slicing to get the relevant parts of the `cleaned_revenue` list, then use the `sum()` and `len()` functions to calculate the total and average for each month: the average is the total revenue divided by the number of weeks.

***Tip***: Be careful about how indexing works in Python! You might want to try printing the slices you pull out first to check that they're capturing the correct values, before trying to summarize them.
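As a quick reminder of how slice bounds behave, here is a minimal sketch on a small made-up list (`letters` is hypothetical, not the ASOS data). The start index is included, the stop index is excluded, and counting begins at 0.

```python
# Hypothetical example list (not the ASOS data), used only to show slice bounds
letters = ['a', 'b', 'c', 'd', 'e', 'f']

first_two = letters[:2]    # indexes 0 and 1
middle_two = letters[2:4]  # indexes 2 and 3; the stop index 4 is excluded
last_two = letters[4:]     # index 4 through the end of the list

print(first_two)
print(middle_two)
print(last_two)

# Sanity check: together the slices should cover every element exactly once
print(len(first_two) + len(middle_two) + len(last_two) == len(letters))
```

The same kind of length check works on your monthly slices: their lengths should add up to the number of weeks in the full list.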

In [None]:
# revenue by month
june_revenue = cleaned_revenue[:4] # cleaned_revenue[0:4] will also work
july_revenue = cleaned_revenue[4:9]
august_revenue = cleaned_revenue[9:] # cleaned_revenue[-5:] will also work if you want to get fancy.

# calculate sum
june_total = sum(june_revenue)
july_total = sum(july_revenue)
august_total = sum(august_revenue)

# calculate avg (total revenue divided by number of weeks)
june_avg = june_total / len(june_revenue)
july_avg = july_total / len(july_revenue)
august_avg = august_total / len(august_revenue)

# print the total amount and average revenue for each month
print(f"June: total = {june_total}, weekly average = {june_avg}")
print(f"July: total = {july_total}, weekly average = {july_avg}")
print(f"August: total = {august_total}, weekly average = {august_avg}")


June: total = 282, weekly average = 70.5
July: total = 384, weekly average = 76.8
August: total = 372, weekly average = 74.4


(Double-click this cell and add, below this line, which month had the highest average revenue.)

> 