# Sales performance by designated market area (DMA)
Overview
The sales_archive.db database contains four tables: dmas, sales, transactions, and visits. The purpose of this exercise is to show performance by designated market area, commonly referred to as DMA.

Questions 
1. What’s the total sales amount by designated market area (DMA)? List the name (not the id) and total sales amount for each DMA.

2. Average Order Value (AOV) is defined as sales.amount divided by transactions.transaction_count. List the name (not the id) and average order value for each DMA on January 1st, 2019. Order the result set from highest average order value to lowest.

3. For each DMA, calculate the average, lowest, and highest sales.amount for the month of January 2019. List the name (not the id), average sales amount, minimum sales amount, and maximum sales amount for each DMA.

4. Seems like something may be wrong with the data in our visits table. For the month of February 2019, list the name (not the id) and the frequency (count of occurrences where the condition is true) where a DMA’s visit count is greater than the transaction count. Order by the DMA name.

In [1]:
import pandas as pd
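from sqlalchemy import create_engine, inspect

con = create_engine("sqlite:///sales_archive.db")
# List the tables through the inspector; Engine.table_names() no longer exists in SQLAlchemy 2.0
inspect(con).get_table_names()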

['dmas', 'sales', 'transactions', 'visits']

In [2]:
# 1. Sanity check first: the grand total of sales.amount across all DMAs
pd.read_sql_query("SELECT SUM(amount) FROM sales" , con)

Unnamed: 0,SUM(amount)
0,103923934


In [3]:
pd.read_sql_query("SELECT * FROM dmas" , con).head()

Unnamed: 0,id,name,state
0,501,NEW YORK,NY
1,502,BINGHAMTON,NY
2,506,BOSTON,MA
3,514,BUFFALO,NY
4,521,PROVIDENCE-NEW BEDFORD,MA


In [4]:
# 1. List the name (not the id) and total sales amount for each DMA.
pd.read_sql_query("select name,sum(amount) from dmas inner join sales on dmas.id = sales.dma_id group by name order by sum(amount)", con).head()

Unnamed: 0,name,sum(amount)
0,BAKERSFIELD,2444894
1,PADUCAH-CAPE GIRARDEAU-HARRISBURG,2476432
2,WATERTOWN,2528317
3,SAINT LOUIS,2559709
4,QUINCY-HANNIBAL-KEOKUK,2559941
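
The per-DMA total above uses an inner join, so a DMA with no rows in sales would drop out of the result. Below is a minimal sketch of a variant that keeps such DMAs with a total of 0. It runs against a toy in-memory database: the table and column names mirror the queries in this notebook, but the rows (ALPHA, BETA, GAMMA) are made up, and whether sales_archive.db actually contains DMAs without sales is not known.

```python
import sqlite3

# Toy data only; these rows are hypothetical and not taken from sales_archive.db.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dmas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (dma_id INTEGER, date TEXT, amount INTEGER);
    INSERT INTO dmas VALUES (1, 'ALPHA'), (2, 'BETA'), (3, 'GAMMA');
    INSERT INTO sales VALUES
        (1, '2019-01-01', 100),
        (1, '2019-01-02', 50),
        (2, '2019-01-01', 70);
""")

# LEFT JOIN keeps every DMA; COALESCE turns the NULL sum of an unmatched DMA into 0.
# Grouping by id as well as name guards against two DMAs sharing a name.
totals = conn.execute("""
    SELECT d.name, COALESCE(SUM(s.amount), 0) AS total_sales
    FROM dmas d
    LEFT JOIN sales s ON s.dma_id = d.id
    GROUP BY d.id, d.name
    ORDER BY total_sales DESC
""").fetchall()

print(totals)  # [('ALPHA', 150), ('BETA', 70), ('GAMMA', 0)]
```

If every DMA has at least one sale, this returns the same totals as the inner join version.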


In [7]:
# 2. Average Order Value (AOV) is defined as sales.amount divided by transactions.transaction_count.
# List the name (not the id) and average order value for each DMA on January 1st, 2019,
# ordered from highest average order value to lowest.
query2 = '''select b.name, sum_amount/transaction_count as aov
from  (
	select date,dmas.id,name,sum(amount) sum_amount 
	from dmas 
	inner join sales 
		on dmas.id = sales.dma_id  
	group by date,dmas.id,name
) b
join transactions t 
	on b.id = t.dma_id and t.date = b.date
where t.date = '2019-01-01'
group by b.name,sum_amount/transaction_count
order by sum_amount/transaction_count desc
'''



pd.read_sql_query(query2, con)

Unnamed: 0,name,aov
0,WATERTOWN,2401
1,PADUCAH-CAPE GIRARDEAU-HARRISBURG,2302
2,LOS ANGELES,1803
3,DAVENPORT-ROCK ISLAND-MOLINE,1288
4,MEDFORD-KLAMATH FALLS,1036
5,MONTEREY-SALINAS,1019
6,BURLINGTON-PLATTSBURGH,825
7,NEW YORK,727
8,PROVIDENCE-NEW BEDFORD,517
9,EUREKA,346
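
The AOV values above are whole numbers, which suggests that both operands are stored as integers: SQLite truncates the result when it divides one integer by another. The sketch below shows the difference on a toy in-memory database. The table and column names follow the queries in this notebook, while the rows are made up and the integer column types are an assumption about sales_archive.db.

```python
import sqlite3

# Toy data only; the rows and the INTEGER column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dmas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (dma_id INTEGER, date TEXT, amount INTEGER);
    CREATE TABLE transactions (dma_id INTEGER, date TEXT, transaction_count INTEGER);
    INSERT INTO dmas VALUES (1, 'ALPHA'), (2, 'BETA');
    INSERT INTO sales VALUES (1, '2019-01-01', 100), (2, '2019-01-01', 250);
    INSERT INTO transactions VALUES (1, '2019-01-01', 8), (2, '2019-01-01', 100);
""")

# Multiplying by 1.0 promotes the numerator to a float before the division.
aov_rows = conn.execute("""
    SELECT d.name,
           s.amount / t.transaction_count       AS aov_int,
           s.amount * 1.0 / t.transaction_count AS aov_float
    FROM sales s
    JOIN transactions t ON t.dma_id = s.dma_id AND t.date = s.date
    JOIN dmas d ON d.id = s.dma_id
    WHERE s.date = '2019-01-01'
    ORDER BY aov_float DESC
""").fetchall()

print(aov_rows)  # [('ALPHA', 12, 12.5), ('BETA', 2, 2.5)]
```

When two DMAs have AOVs that differ only after the decimal point, the truncated version cannot rank them reliably, so the float form is the safer choice for ordering.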


3. For each DMA, calculate the average, lowest, and highest sales.amount for the month of January 2019. List the name (not the id), average sales amount, minimum sales amount, and maximum sales amount for each DMA.

In [65]:
query3 = '''select name, avg(amount),min(amount),max(amount)
from sales s
join dmas d
	on d.id = s.dma_id
where date >= '2019-01-01' and date < '2019-02-01' -- month of January 2019
group by name
'''
pd.read_sql_query(query3, con)

Unnamed: 0,name,avg(amount),min(amount),max(amount)
0,ALBANY-SCHENECTADY-TROY,55127.645161,1960,97518
1,BAKERSFIELD,44766.419355,289,97441
2,BINGHAMTON,57447.290323,803,98647
3,BOSTON,48416.774194,1217,94740
4,BUFFALO,50881.258065,513,96803
5,BURLINGTON-PLATTSBURGH,50320.064516,1287,96895
6,CHAMPAIGN-SPRINGFIELD-DECATUR,47593.645161,3182,91906
7,CHICAGO,48709.096774,5054,95937
8,CHICO-REDDING,54180.709677,10645,92845
9,DAVENPORT-ROCK ISLAND-MOLINE,51335.903226,223,99563
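
A month filter is easiest to get right as a half-open interval: on or after the first day of the month and strictly before the first day of the next. The sketch below checks this on a toy in-memory table with one row just before January and one just after it. The names follow the queries in this notebook, the rows are made up, and dates are assumed to be stored as ISO 'YYYY-MM-DD' text so that string comparison matches date order.

```python
import sqlite3

# Toy data only; the rows are hypothetical and the ISO text date format is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dmas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (dma_id INTEGER, date TEXT, amount INTEGER);
    INSERT INTO dmas VALUES (1, 'ALPHA');
    INSERT INTO sales VALUES
        (1, '2018-12-31', 1000),  -- before January: must be excluded
        (1, '2019-01-01', 10),
        (1, '2019-01-31', 30),
        (1, '2019-02-01', 5000);  -- after January: must be excluded
""")

# Half-open interval [2019-01-01, 2019-02-01) selects exactly January 2019.
jan_stats = conn.execute("""
    SELECT d.name, AVG(s.amount), MIN(s.amount), MAX(s.amount)
    FROM sales s
    JOIN dmas d ON d.id = s.dma_id
    WHERE s.date >= '2019-01-01' AND s.date < '2019-02-01'
    GROUP BY d.name
""").fetchall()

print(jan_stats)  # [('ALPHA', 20.0, 10, 30)]
```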


4. Seems like something may be wrong with the data in our visits table. 
For the month of February 2019, list the name (not the id) and the frequency (count of occurrences where the condition is true) where a DMA’s visit count is greater than the transaction count. Order by the DMA name.

In [10]:
query4 = '''

select name, count(*) as frequency
from visits  v
join transactions t
	on v.dma_id = t.dma_id  AND v.date =t.date
    
join dmas d
	on d.id = t.dma_id
    
where v.date >= '2019-02-01' and v.date < '2019-03-01' -- February 2019 only; BETWEEN would also include March 1st
	and visit_count > transaction_count
group by name 
order by name 

--where a DMA’s visit count is greater than the transaction count
'''

pd.read_sql_query(query4, con)

Unnamed: 0,name,frequency
0,BAKERSFIELD,1
1,BOSTON,3
2,BUFFALO,1
3,BURLINGTON-PLATTSBURGH,2
4,CHAMPAIGN-SPRINGFIELD-DECATUR,1
5,CHICAGO,2
6,DAVENPORT-ROCK ISLAND-MOLINE,2
7,ELMIRA,1
8,EUREKA,2
9,EVANSVILLE,3
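
Filtering on the condition in the WHERE clause means a DMA that never has more visits than transactions does not appear in the result at all. If a frequency of 0 should be reported as well, one option is to count the condition inside the aggregate instead. The sketch below compares the two on a toy in-memory database. The names follow the queries in this notebook, the rows are made up, and whether the exercise expects zero rows is a judgment call.

```python
import sqlite3

# Toy data only; these rows are hypothetical and not taken from sales_archive.db.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dmas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (dma_id INTEGER, date TEXT, visit_count INTEGER);
    CREATE TABLE transactions (dma_id INTEGER, date TEXT, transaction_count INTEGER);
    INSERT INTO dmas VALUES (1, 'ALPHA'), (2, 'BETA');
    INSERT INTO visits VALUES (1, '2019-02-10', 5), (1, '2019-02-11', 50), (2, '2019-02-10', 2);
    INSERT INTO transactions VALUES (1, '2019-02-10', 9), (1, '2019-02-11', 20), (2, '2019-02-10', 3);
""")

base = """
    FROM visits v
    JOIN transactions t ON t.dma_id = v.dma_id AND t.date = v.date
    JOIN dmas d ON d.id = v.dma_id
    WHERE v.date >= '2019-02-01' AND v.date < '2019-03-01'
"""

# Variant 1: the condition filters rows, so DMAs with no matching day disappear.
filtered = conn.execute(
    "SELECT d.name, COUNT(*) AS frequency" + base +
    " AND v.visit_count > t.transaction_count GROUP BY d.name ORDER BY d.name"
).fetchall()

# Variant 2: the condition is counted per row, so every DMA with February data is listed.
counted = conn.execute(
    "SELECT d.name, SUM(CASE WHEN v.visit_count > t.transaction_count THEN 1 ELSE 0 END) AS frequency"
    + base + " GROUP BY d.name ORDER BY d.name"
).fetchall()

print(filtered)  # [('ALPHA', 1)]
print(counted)   # [('ALPHA', 1), ('BETA', 0)]
```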


FizzBuzz

Write a function that, given two parameters x and n, returns a list of strings and integers
for the range from x up to (but not including) n:

- for multiples of three append "Fizz" instead of the number
- for multiples of five append "Buzz"
- for numbers which are multiples of both three and five append "FizzBuzz"
- for all other numbers, append the number itself

In [72]:
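def fizz_buzz(x: int, n: int) -> list:
    """Return "Fizz", "Buzz", "FizzBuzz", or the number itself for each i in range(x, n)."""
    selected = []
    for i in range(x, n):
        # Check the combined case first so the single-multiple branches stay simple
        if i % 3 == 0 and i % 5 == 0:
            selected.append("FizzBuzz")
        elif i % 3 == 0:
            selected.append("Fizz")
        elif i % 5 == 0:
            selected.append("Buzz")
        else:
            selected.append(i)

    return selected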

In [73]:
# if __name__ == "__main__":
A = fizz_buzz(0, 20)
print(A)

['FizzBuzz', 1, 2, 'Fizz', 4, 'Buzz', 'Fizz', 7, 8, 'Fizz', 'Buzz', 11, 'Fizz', 13, 14, 'FizzBuzz', 16, 17, 'Fizz', 19]
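
As a cross-check, the same rules can be written more compactly by building the label from its two parts and falling back to the number when the label is empty. This is a sketch of an alternative, not a replacement for the function above. The name fizz_buzz_compact is hypothetical, and it keeps the same convention that n is excluded from the range.

```python
def fizz_buzz_compact(x: int, n: int) -> list:
    """Apply the FizzBuzz rules to each i in range(x, n); n itself is excluded."""
    result = []
    for i in range(x, n):
        # "Fizz" + "Buzz" naturally yields "FizzBuzz" for multiples of both 3 and 5
        label = ("Fizz" if i % 3 == 0 else "") + ("Buzz" if i % 5 == 0 else "")
        # An empty label is falsy, so `or` falls back to the number itself
        result.append(label or i)
    return result


print(fizz_buzz_compact(1, 16))
# [1, 2, 'Fizz', 4, 'Buzz', 'Fizz', 7, 8, 'Fizz', 'Buzz', 11, 'Fizz', 13, 14, 'FizzBuzz']
```

Note that 0 is a multiple of every integer, so a range starting at 0 begins with 'FizzBuzz', as in the output above.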
