An e-commerce company, Store 1, has recently started collecting data about its customers. Store 1's goal is to better understand customer behavior and make data-driven decisions to improve their online experience.
<br><br>
As part of the analytical team, your first task is to assess the quality of a sample of collected data and prepare it for future analysis.

# Quiz
Store 1 aims to ensure consistency in data collection. As part of this effort, the quality of the data collected about users needs to be evaluated. You have been asked to review the collected data and propose changes. Below, you will see data about a specific user. Review the data and identify potential issues.

In [1]:
user_id = '32415'
user_name = ' mike_reed '
user_age = 32.0
fav_categories = ['ELECTRONICS', 'SPORT', 'BOOKS']

**Options:**<br>
1. The data type of `user_id` should be changed from string to integer.<br>2. The `user_name` variable contains a string with unnecessary spacing and an underscore between the first and last name.<br>3. The data type of `user_age` is incorrect.<br>4. The `fav_categories` list contains strings in uppercase. Instead, we should convert the list values to lowercase.

Write the numbers of the options you identified as problems in the Markdown cell below. If you identified multiple problems, separate the numbers with commas. For example, if you think options 1 and 3 describe problems, write 1, 3 and explain why.

**Write your answer and explain your reasoning:**<br>2, 3<br>The `user_name` variable has unnecessary whitespace and an underscore that can be replaced with a space.<br>The `user_age` variable is stored as a float; a float is not an error in itself, but an age in whole years is more appropriately represented as an integer (`int`).

# Task 1
Let's implement the changes we identified. First, we want to fix the issues with the `user_name` variable. As we saw, it has unnecessary spaces around the name and an underscore separating the first and last name. Your goal is to remove the surrounding spaces and then replace the underscore with a space.

In [2]:
user_name = ' mike_reed '
user_name = user_name.strip()
user_name = user_name.replace('_', ' ')

print(user_name)

mike reed
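
The two steps above can also be written as one chained expression, since `strip()` and `replace()` each return a new string. This is only a sketch of an equivalent form; the names `raw_name` and `clean_name` are ours and do not appear in the task.

```python
# Sketch: same cleanup as above, chained into one expression.
# raw_name and clean_name are illustrative names, not part of the task.
raw_name = ' mike_reed '

# strip() removes leading/trailing whitespace, replace() swaps the underscore
clean_name = raw_name.strip().replace('_', ' ')

print(clean_name)  # prints: mike reed
```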


# Task 2
Next, we need to split the updated `user_name` into two substrings to get a list containing two values: the string for the first name and the string for the last name.

In [3]:
user_name = 'mike reed'
name_split = user_name.split()

print(name_split)

['mike', 'reed']
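
If we need the two parts as separate variables, the list returned by `split()` can be unpacked directly. This sketch assumes the name contains exactly two words; with more or fewer words the unpacking raises a `ValueError`. The names `first_name` and `last_name` are ours.

```python
# Sketch: unpack the result of split() into two variables.
# Assumes exactly two whitespace-separated words in the name.
user_name = 'mike reed'

first_name, last_name = user_name.split()

print(first_name)  # prints: mike
print(last_name)   # prints: reed
```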


# Task 3
Great! Now we want to work with the `user_age` variable. As we mentioned before, it has an incorrect data type. Let's fix this problem by transforming the data type and printing the final result.

In [4]:
user_age = 32.0
user_age = int(user_age)

print(user_age)

32
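
One thing to keep in mind: `int()` truncates the fractional part of a float instead of rounding it. For `32.0` nothing is lost, but for other values a check can be useful. The sketch below uses `float.is_integer()` as one possible guard; whether such a check is needed depends on how the ages were collected, which the task does not specify.

```python
# Sketch: convert only when the float holds a whole number.
user_age = 32.0

if user_age.is_integer():
    user_age = int(user_age)  # safe: no fractional part is discarded

print(user_age)     # prints: 32

# int() truncates toward zero rather than rounding
print(int(32.9))    # prints: 32
```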


# Task 4
As we know, data is not always perfect. We have to consider scenarios where the value of `user_age` cannot be converted to an integer. To prevent our system from crashing, we should take measures in advance.<br><br>Write code that tries to convert the `user_age` variable to an integer. If the attempt fails, display a message asking the user to provide their age as a numeric value with the message: `Please provide your age as a numeric value.`

In [None]:
user_age = 'thirty two' # this is the variable that stores the age as a string.

# code that will try to transform user_age to an integer if possible
try:
    user_age = int(user_age)
    print(user_age)
except ValueError:
    print('Please provide your age as a numeric value.')


Please provide your age as a numeric value.
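
If the same conversion is needed for many users, the `try`/`except` logic can be wrapped in a small function. The helper `parse_age` below is hypothetical (it is not part of the task); it returns `None` when the value cannot be converted, and it also catches `TypeError` on the assumption that some ages may be missing (`None`).

```python
# Sketch: a hypothetical helper that wraps the conversion attempt.
def parse_age(raw_age):
    """Return raw_age as an int, or None if it cannot be converted."""
    try:
        return int(raw_age)
    except (ValueError, TypeError):
        # ValueError: non-numeric string; TypeError: e.g. None
        return None


for raw_age in ['32', 'thirty two', None]:
    age = parse_age(raw_age)
    if age is None:
        print('Please provide your age as a numeric value.')
    else:
        print(age)
```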


# Task 5

Finally, notice that all favorite categories are stored in uppercase. To fill a new list called `fav_categories_low` with the same categories in lowercase, iterate over the values in the `fav_categories` list, modify them, and append the new values to the `fav_categories_low` list. As always, print the final result.

In [None]:
fav_categories = ['ELECTRONICS', 'SPORT', 'BOOKS']
fav_categories_low = []

for categoria in fav_categories:
    fav_categories_low.append(categoria.lower())

print(fav_categories_low)

['electronics', 'sport', 'books']
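
The loop-and-append pattern above can also be expressed as a list comprehension. This is an equivalent sketch, not a replacement for the loop the task asks for.

```python
# Sketch: build the lowercase list with a list comprehension.
fav_categories = ['ELECTRONICS', 'SPORT', 'BOOKS']

fav_categories_low = [category.lower() for category in fav_categories]

print(fav_categories_low)  # prints: ['electronics', 'sport', 'books']
```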


# Task 6
We have obtained additional information about our users' spending habits, including the amount spent in each of their favorite categories. Management is interested in the following metrics:
- Total amount spent by the user
- Minimum amount spent
- Maximum amount spent

Let's calculate and print these values:

In [None]:
fav_categories_low = ['electronics', 'sport', 'books']
spendings_per_category = [894, 213, 173]

total_amount = sum(spendings_per_category)
max_amount = max(spendings_per_category)
min_amount = min(spendings_per_category)

print(total_amount)
print(max_amount)
print(min_amount)

1280
894
173
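
The two lists are parallel: the amount at each position belongs to the category at the same position. As a sketch of how they could be linked, `zip()` pairs them up, which also lets us see which category the maximum belongs to. The names `spending_by_category` and `top_category` are ours.

```python
# Sketch: pair each category with its amount and find the top category.
fav_categories_low = ['electronics', 'sport', 'books']
spendings_per_category = [894, 213, 173]

# zip() pairs items by position; dict() turns the pairs into a lookup table
spending_by_category = dict(zip(fav_categories_low, spendings_per_category))

# max() over the keys, compared by their amounts
top_category = max(spending_by_category, key=spending_by_category.get)

print(top_category, spending_by_category[top_category])  # prints: electronics 894
```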


# Task 7
The company wants to offer discounts to its loyal customers. Customers who make purchases totaling over $1,500 are considered loyal and will receive a discount.<br><br>Our goal is to create a `while` loop that keeps running while the total amount spent is below the target amount and stops once the target is reached. To simulate new purchases, the `new_purchase` variable generates a number between 30 and 80 in each iteration. This represents the amount of money spent on a new purchase, and it's what you need to add to the total.<br><br>Once the target amount is reached and the `while` loop terminates, the final value is printed.



In [None]:
from random import randint

total_amount_spent = 1280
target_amount = 1500

while total_amount_spent < target_amount:
    new_purchase = randint(30, 80)  # generate a random integer from 30 to 80, inclusive
    total_amount_spent += new_purchase

print(total_amount_spent)

1543
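
Because the purchases are random, the printed total changes from run to run. The sketch below shows one way to make the simulation repeatable and to count how many purchases were needed. The seed value `42` and the `purchases` counter are our assumptions, not part of the task.

```python
# Sketch: a repeatable version of the simulation that also counts purchases.
from random import Random

rng = Random(42)  # fixed seed (arbitrary choice) so every run gives the same sequence

total_amount_spent = 1280
target_amount = 1500
purchases = 0  # hypothetical counter, not required by the task

while total_amount_spent < target_amount:
    total_amount_spent += rng.randint(30, 80)  # inclusive on both ends
    purchases += 1

print(total_amount_spent, purchases)
```

Whatever the seed, the final total lands between 1500 and 1579, because the last purchase before stopping adds at most 80 to a total that was still below 1500.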


# Task 8
Now we have all the information about a customer the way we want it. Management has asked us to find a way to summarize all the information about a user. Your goal is to create a formatted string that uses information from the `user_id`, `user_name`, and `user_age` variables.<br><br>Here is the final string we want to create: `User 32415 is named mike and is 32 years old.`

In [None]:
user_id = '32415'
user_name = ['mike', 'reed']
user_age = 32

user_info = f'User {user_id} is named {user_name[0]} and is {user_age} years old.'

print(user_info)

User 32415 is named mike and is 32 years old.
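
An f-string can hold any expression, not only a variable or an index. As a sketch of a variation (not the string the task asks for), the full name can be joined and capitalized before formatting. The name `full_name` is ours.

```python
# Sketch: a variation that shows the capitalized full name.
user_id = '32415'
user_name = ['mike', 'reed']
user_age = 32

# join the two parts with a space, then capitalize each word
full_name = ' '.join(user_name).title()

user_info = f'User {user_id} is named {full_name} and is {user_age} years old.'

print(user_info)  # prints: User 32415 is named Mike Reed and is 32 years old.
```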


As you may already know, companies collect and store data in a specific way...

| user_id | user_name | user_age | purchase_category | spending_per_category |
| --- | --- | --- | --- | --- |
| '32415' | 'mike', 'reed' | 32 | 'electronics', 'sport', 'books' | 894, 213, 173 |
| '31980' | 'kate', 'morgan' | 24 | 'clothes', 'shoes' | 439, 390 |

In technical terms, a table is simply a nested list that contains a sublist for each user.

Store 1 created this table for its users. It is stored in the `users` variable. Each sublist contains the user ID, first and last name, age, favorite categories, and the amount spent in each category.
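
To make the nested structure concrete, here is a small sketch that builds the two rows of the table as a nested list and reads individual values by chaining indexes. The index positions follow the column order described above.

```python
# Sketch: the two table rows as a nested list, with chained indexing.
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'], [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'shoes'], [439, 390]],
]

first_user = users[0]      # the whole sublist for the first user
print(first_user[0])       # user ID, prints: 32415
print(users[1][1][0])      # first name of the second user, prints: kate
print(users[1][4])         # amounts spent by the second user, prints: [439, 390]
```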

# Task 9
To calculate the company's revenue, follow these steps:<br>1. Use a `for` loop to iterate through the `users` list.<br>2. Extract the list of expenses for each user and sum the values.<br>3. Update the revenue value with the total for each user.<br><br>This will provide the company's total revenue, which you will print at the end.

In [None]:
users = [
    # this is the beginning of the first sublist
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'],
        [894, 213, 173]
    ], # this is the end of the first sublist

    # this is the beginning of the second sublist
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'shoes'],
        [439, 390]
    ] # this is the end of the second sublist
]

revenue = 0

for user in users:
    spendings_list = user[4]  # extract the list of expenses for this user
    total_spendings = sum(spendings_list)  # sum the expenses in all categories
    revenue += total_spendings  # update the revenue

print(revenue)

2109
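
The same accumulation can be written more compactly by passing a generator expression to `sum()`. This is a sketch of an equivalent form; the task itself asks for the explicit `for` loop.

```python
# Sketch: compute the revenue with nested sum() calls.
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'], [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'shoes'], [439, 390]],
]

# inner sum: one user's total; outer sum: total across all users
revenue = sum(sum(user[4]) for user in users)

print(revenue)  # prints: 2109
```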


# Task 10
Use a `for` loop to iterate through the list of users we provided and print the names of customers under 30 years old.

In [None]:
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'],
     [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'books'], [439,
     390]],
    ['32156', ['john', 'doe'], 37, ['electronics', 'home', 'food'],
     [459, 120, 99]],
    ['32761', ['samantha', 'smith'], 29, ['clothes', 'electronics',
     'beauty'], [299, 679, 85]],
    ['32984', ['david', 'white'], 41, ['books', 'home', 'sport'], [234,
     329, 243]],
    ['33001', ['emily', 'brown'], 26, ['beauty', 'home', 'food'], [213,
     659, 79]],
    ['33767', ['maria', 'garcia'], 33, ['clothes', 'food', 'beauty'],
     [499, 189, 63]],
    ['33912', ['jose', 'martinez'], 22, ['sport', 'electronics', 'home'
     ], [259, 549, 109]],
    ['34009', ['lisa', 'wilson'], 35, ['home', 'books', 'clothes'],
     [329, 189, 329]],
    ['34278', ['james', 'lee'], 28, ['beauty', 'clothes', 'electronics'
     ], [189, 299, 579]],
    ]


for user in users:
    if user[2] < 30:
        full_name = f"{user[1][0]} {user[1][1]}"
        print(full_name)


kate morgan
samantha smith
emily brown
jose martinez
james lee
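
Numeric indexes such as `user[2]` are easy to mix up. One way to make the loop more readable is to unpack each sublist into named variables in the `for` statement. The sketch below uses a shortened three-user sample of the list above and collects the names in a list called `under_30` (our name) before printing.

```python
# Sketch: unpack each sublist into named variables (shortened sample of users).
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'], [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'books'], [439, 390]],
    ['32761', ['samantha', 'smith'], 29, ['clothes', 'electronics', 'beauty'], [299, 679, 85]],
]

under_30 = []
for user_id, name, age, categories, spendings in users:
    if age < 30:
        under_30.append(' '.join(name))  # join first and last name

print(under_30)  # prints: ['kate morgan', 'samantha smith']
```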


# Task 11
Let's combine tasks 9 and 10 and print the names of users under 30 with total spending over $1,000.

In [12]:
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'],
     [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'books'], [439,
     390]],
    ['32156', ['john', 'doe'], 37, ['electronics', 'home', 'food'],
     [459, 120, 99]],
    ['32761', ['samantha', 'smith'], 29, ['clothes', 'electronics',
     'beauty'], [299, 679, 85]],
    ['32984', ['david', 'white'], 41, ['books', 'home', 'sport'], [234,
     329, 243]],
    ['33001', ['emily', 'brown'], 26, ['beauty', 'home', 'food'], [213,
     659, 79]],
    ['33767', ['maria', 'garcia'], 33, ['clothes', 'food', 'beauty'],
     [499, 189, 63]],
    ['33912', ['jose', 'martinez'], 22, ['sport', 'electronics', 'home'
     ], [259, 549, 109]],
    ['34009', ['lisa', 'wilson'], 35, ['home', 'books', 'clothes'],
     [329, 189, 329]],
    ['34278', ['james', 'lee'], 28, ['beauty', 'clothes', 'electronics'
     ], [189, 299, 579]],
    ]


for user in users:
    if user[2] < 30 and sum(user[4]) > 1000:
        full_name = f"{user[1][0]} {user[1][1]}"
        print(full_name)

samantha smith
james lee
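
If management later changes the age or spending thresholds, the filter can be wrapped in a function with the thresholds as parameters. The function `young_big_spenders` and its parameter names are hypothetical; the sketch runs on a shortened three-user sample of the list above.

```python
# Sketch: a hypothetical reusable filter with adjustable thresholds.
def young_big_spenders(users, max_age=30, min_total=1000):
    """Return full names of users younger than max_age who spent more than min_total."""
    names = []
    for user in users:
        if user[2] < max_age and sum(user[4]) > min_total:
            names.append(f"{user[1][0]} {user[1][1]}")
    return names


# shortened sample of the users list
users = [
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'books'], [439, 390]],
    ['32761', ['samantha', 'smith'], 29, ['clothes', 'electronics', 'beauty'], [299, 679, 85]],
    ['34278', ['james', 'lee'], 28, ['beauty', 'clothes', 'electronics'], [189, 299, 579]],
]

print(young_big_spenders(users))                 # prints: ['samantha smith', 'james lee']
print(young_big_spenders(users, min_total=800))  # kate's total of 829 now qualifies too
```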


# Task 12
Now let's print the names of all users who bought clothes.

In [None]:
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'],
     [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'books'], [439,
     390]],
    ['32156', ['john', 'doe'], 37, ['electronics', 'home', 'food'],
     [459, 120, 99]],
    ['32761', ['samantha', 'smith'], 29, ['clothes', 'electronics',
     'beauty'], [299, 679, 85]],
    ['32984', ['david', 'white'], 41, ['books', 'home', 'sport'], [234,
     329, 243]],
    ['33001', ['emily', 'brown'], 26, ['beauty', 'home', 'food'], [213,
     659, 79]],
    ['33767', ['maria', 'garcia'], 33, ['clothes', 'food', 'beauty'],
     [499, 189, 63]],
    ['33912', ['jose', 'martinez'], 22, ['sport', 'electronics', 'home'
     ], [259, 549, 109]],
    ['34009', ['lisa', 'wilson'], 35, ['home', 'books', 'clothes'],
     [329, 189, 329]],
    ['34278', ['james', 'lee'], 28, ['beauty', 'clothes', 'electronics'
     ], [189, 299, 579]],
    ]


for user in users:
    if 'clothes' in user[3]:  # checks if 'clothes' is in the list of categories
        full_name = f"{user[1][0]} {user[1][1]}"
        print(full_name)

kate morgan
samantha smith
maria garcia
lisa wilson
james lee
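
Since the categories and the amounts are parallel lists, the position of `'clothes'` in `user[3]` also tells us where the matching amount sits in `user[4]`. The sketch below, on a shortened three-user sample, uses `list.index()` to look up how much each of these users spent on clothes. The dictionary `clothes_spending` is our own addition.

```python
# Sketch: look up the amount spent on clothes via the shared list position.
users = [
    ['32415', ['mike', 'reed'], 32, ['electronics', 'sport', 'books'], [894, 213, 173]],
    ['31980', ['kate', 'morgan'], 24, ['clothes', 'books'], [439, 390]],
    ['34009', ['lisa', 'wilson'], 35, ['home', 'books', 'clothes'], [329, 189, 329]],
]

clothes_spending = {}
for user in users:
    categories, spendings = user[3], user[4]
    if 'clothes' in categories:
        position = categories.index('clothes')  # position of 'clothes' in this user's list
        full_name = f"{user[1][0]} {user[1][1]}"
        clothes_spending[full_name] = spendings[position]

print(clothes_spending)  # prints: {'kate morgan': 439, 'lisa wilson': 329}
```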
