### Prompt

Financial analysts often need to consolidate data from various sources. This week's challenge focuses on using Alteryx functions in financial planning and analysis (FP&A) projects.

The first dataset includes sales information for various countries and business segments for the years 2022 and 2023, with the sales figures recorded in each country’s local currency. The second dataset provides the conversion rates from the countries’ currencies to euros.

Your first task is to create a final table that includes the segment, region, and total sales in euros for both 2022 and 2023. (Tip: Remove any columns where the sale price is null.)

For the second part of the challenge, calculate the year-over-year variation per segment (2022–2023) and identify the segments that experienced negative variations across all three regions: Europe, Asia Pacific (APAC), and the USA.

In [1]:
# importing libraries

import pandas as pd

import os 

import numpy as np

In [56]:
# importing data 

cwd = os.getcwd()

file_names = ['exchange_rates.csv','sales_figures.csv']

# building the full path for each file

file_path_list = []

for x in file_names:

    file_path_list.append(os.path.join(cwd, x))

In [57]:
file_path_list

['C:\\Users\\CharlesYi\\Jupyter Notebook\\Alteryx Challenges\\Challenge 449_Global Sales Analysis\\exchange_rates.csv',
 'C:\\Users\\CharlesYi\\Jupyter Notebook\\Alteryx Challenges\\Challenge 449_Global Sales Analysis\\sales_figures.csv']

In [58]:
# reading each CSV into a dict of DataFrames keyed by file name

dfs = {}

for x in file_path_list:
    
    file_name = os.path.basename(x)
    
    dfs[file_name] = pd.read_csv(x)

In [115]:
dfs

{'exchange_rates.csv':    Currency     EUR     USD     JPY     GBP     CHF     CAD     MAD    CNY  \
 0       EUR  1.0000  1.0700  158.50  0.8600  0.9800  1.4700  11.240  7.460   
 1       USD  0.9300  1.0000  148.00  0.8000  0.9200  1.3700  10.500  6.970   
 2       JPY  0.0063  0.0068    1.00  0.0054  0.0062  0.0093   0.071  0.047   
 3       GBP  1.1700  1.2500  185.00  1.0000  1.1400  1.7100  13.160  8.730   
 4       CHF  1.0200  1.0900  161.00  0.8800  1.0000  1.5000  11.550  7.660   
 5       CAD  0.6800  0.7300  108.30  0.5800  0.6700  1.0000   7.700  5.100   
 6       MAD  0.0890  0.0950   14.05  0.0750  0.0870  0.1300   1.000  0.660   
 7       CNY  0.1300  0.1400   21.60  0.1100  0.1300  0.2000   1.520  1.000   
 8       THB  0.0260  0.0270    4.10  0.0220  0.0250  0.0370   0.290  0.190   
 9       ILS  0.2400  0.2600   37.80  0.2100  0.2400  0.3500   2.620  1.790   
 10      RUB  0.0098  0.0110    1.56  0.0084  0.0096  0.0140   0.110  0.073   
 
       THB    ILS     RUB  


In [59]:
df_exchange_rate = dfs['exchange_rates.csv']

df_exchange_rate.head()

Unnamed: 0,Currency,EUR,USD,JPY,GBP,CHF,CAD,MAD,CNY,THB,ILS,RUB
0,EUR,1.0,1.07,158.5,0.86,0.98,1.47,11.24,7.46,39.23,4.18,102.04
1,USD,0.93,1.0,148.0,0.8,0.92,1.37,10.5,6.97,36.66,3.89,95.0
2,JPY,0.0063,0.0068,1.0,0.0054,0.0062,0.0093,0.071,0.047,0.25,0.026,0.64
3,GBP,1.17,1.25,185.0,1.0,1.14,1.71,13.16,8.73,45.74,4.85,118.8
4,CHF,1.02,1.09,161.0,0.88,1.0,1.5,11.55,7.66,40.1,4.25,104.0


In [60]:
df_sales_figures = dfs['sales_figures.csv']

df_sales_figures.head(10)

Unnamed: 0,Region,Date,Segment,Country,Currency,Product,Discount Band,Units Sold,Manufacturing Price,Sale Price,Gross Sales,Discounts,Sales,COGS,Profit
0,APAC,2022-02-01,Small Business,Japan,JPY,VTT,Low,1778.0,260.0,350.0,622300.0,24892.0,597408.0,462280.0,135128.0
1,APAC,2022-02-01,Enterprise,China,CNY,Velo,Medium,1802.0,10.0,20.0,36040.0,1802.0,34238.0,18020.0,16218.0
2,APAC,2022-02-01,Enterprise,Japan,JPY,Velo,Medium,2436.0,250.0,300.0,730800.0,43848.0,686952.0,609000.0,77952.0
3,APAC,2022-02-01,Government,China,CNY,Montana,Medium,1421.0,120.0,20.0,28420.0,1989.4,26430.6,14210.0,12220.6
4,APAC,2022-02-01,Government,China,CNY,Amarilla,High,808.0,250.0,300.0,242400.0,19392.0,223008.0,202000.0,21008.0
5,APAC,2022-02-01,Channel Partners,China,CNY,Carretera,High,1611.0,5.0,7.0,11277.0,1014.93,10262.07,8055.0,2207.07
6,APAC,2022-02-01,Government,Japan,JPY,Amarilla,High,1916.0,120.0,125.0,239500.0,23950.0,215550.0,229920.0,-14370.0
7,APAC,2022-02-01,Government,Japan,JPY,Montana,High,2015.0,260.0,12.0,24180.0,3385.2,20794.8,6045.0,14749.8
8,APAC,2022-02-01,Government,Japan,JPY,Paseo,High,2438.0,120.0,125.0,304750.0,45712.5,259037.5,292560.0,-33522.5
9,APAC,2022-03-01,Government,China,CNY,VTT,Low,727.0,10.0,350.0,254450.0,15267.0,239183.0,189020.0,50163.0
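
The prompt's tip about null sale prices is not applied anywhere in this notebook. A minimal sketch of one way it could be handled, assuming the tip means dropping rows whose `Sale Price` is missing. The frame `toy_sales` and its values are hypothetical; only the column names mirror `sales_figures.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not taken from the challenge files).
toy_sales = pd.DataFrame({
    "Segment": ["Enterprise", "Government", "Midmarket"],
    "Sale Price": [20.0, np.nan, 125.0],
    "Sales": [34238.0, 26430.6, 215550.0],
})

# Keep only the rows where Sale Price is present.
toy_clean = toy_sales.dropna(subset=["Sale Price"]).reset_index(drop=True)

print(toy_clean)
```

On the real data, the same call would be applied to `df_sales_figures` before the merge in Task 1.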


### Task 1

In [61]:
# unpivoting (melting) exchange rates into long format: one row per currency pair

df_exchange_transposed = pd.melt(df_exchange_rate, id_vars = 'Currency')

# keeping only the rates into EUR (value = euros per 1 unit of Currency)

currency_lookup = df_exchange_transposed[df_exchange_transposed['variable'] == 'EUR']

currency_lookup = currency_lookup[['Currency', 'value']]

currency_lookup

Unnamed: 0,Currency,value
0,EUR,1.0
1,USD,0.93
2,JPY,0.0063
3,GBP,1.17
4,CHF,1.02
5,CAD,0.68
6,MAD,0.089
7,CNY,0.13
8,THB,0.026
9,ILS,0.24


In [62]:
# joining currency lookup to data

df_merged = df_sales_figures.merge(currency_lookup, on = 'Currency')

df_merged

Unnamed: 0,Region,Date,Segment,Country,Currency,Product,Discount Band,Units Sold,Manufacturing Price,Sale Price,Gross Sales,Discounts,Sales,COGS,Profit,value
0,APAC,2022-02-01,Small Business,Japan,JPY,VTT,Low,1778.0,260.0,350.0,622300.0,24892.00,597408.0,462280.0,135128.00,0.0063
1,APAC,2022-02-01,Enterprise,Japan,JPY,Velo,Medium,2436.0,250.0,300.0,730800.0,43848.00,686952.0,609000.0,77952.00,0.0063
2,APAC,2022-02-01,Government,Japan,JPY,Amarilla,High,1916.0,120.0,125.0,239500.0,23950.00,215550.0,229920.0,-14370.00,0.0063
3,APAC,2022-02-01,Government,Japan,JPY,Montana,High,2015.0,260.0,12.0,24180.0,3385.20,20794.8,6045.0,14749.80,0.0063
4,APAC,2022-02-01,Government,Japan,JPY,Paseo,High,2438.0,120.0,125.0,304750.0,45712.50,259037.5,292560.0,-33522.50,0.0063
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2266,USA,2023-12-01,Government,USA,USD,VTT,Low,736.0,120.0,20.0,14720.0,588.80,14131.2,7360.0,6771.20,0.9300
2267,USA,2023-12-01,Government,USA,USD,Velo,Medium,1283.0,5.0,300.0,384900.0,30792.00,354108.0,320750.0,33358.00,0.9300
2268,USA,2023-12-01,Small Business,USA,USD,VTT,Medium,380.0,10.0,7.0,2660.0,292.60,2367.4,1900.0,467.40,0.9300
2269,USA,2023-12-01,Channel Partners,USA,USD,Amarilla,High,2761.0,260.0,1200.0,33132.0,3975.84,3313200.0,8283.0,20873.16,0.9300
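
Because `merge` defaults to an inner join, a sales row whose currency is missing from the lookup would be dropped silently. The sketch below shows one way to guard against that with the `how`, `validate`, and `indicator` parameters of `merge`. The toy frames and the currency code `XXX` are hypothetical and only illustrate the check.

```python
import pandas as pd

# Hypothetical toy data; "XXX" is a made-up currency with no exchange rate.
toy_sales = pd.DataFrame({
    "Currency": ["JPY", "USD", "XXX"],
    "Sales": [597408.0, 14131.2, 100.0],
})
toy_lookup = pd.DataFrame({
    "Currency": ["JPY", "USD"],
    "value": [0.0063, 0.93],
})

# how="left" keeps every sales row, validate raises if the lookup has
# duplicate currencies, and indicator adds a _merge column for auditing.
toy_merged = toy_sales.merge(
    toy_lookup,
    on="Currency",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Currencies that found no exchange rate.
unmatched = toy_merged.loc[toy_merged["_merge"] == "left_only", "Currency"].unique().tolist()

print(unmatched)
```

An empty `unmatched` list would confirm that the inner join used in this notebook loses no rows.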


In [63]:
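# creating function to convert local-currency figures to euros
# caveat: every numeric column is multiplied by the rate, including Units Sold
# and 'value' itself (which is why 'value' shows as 4e-05 for JPY below);
# only the monetary columns are meaningful after conversion

def currency_converter(df):

    df = df.copy()  # work on a copy so the input frame is left untouched

    for col in df.columns:

        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col] * df['value']

    return df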
    

In [64]:
# applying function

df_converted = currency_converter(df_merged)

df_converted.head()

Unnamed: 0,Region,Date,Segment,Country,Currency,Product,Discount Band,Units Sold,Manufacturing Price,Sale Price,Gross Sales,Discounts,Sales,COGS,Profit,value
0,APAC,2022-02-01,Small Business,Japan,JPY,VTT,Low,11.2014,1.638,2.205,3920.49,156.8196,3763.6704,2912.364,851.3064,4e-05
1,APAC,2022-02-01,Enterprise,Japan,JPY,Velo,Medium,15.3468,1.575,1.89,4604.04,276.2424,4327.7976,3836.7,491.0976,4e-05
2,APAC,2022-02-01,Government,Japan,JPY,Amarilla,High,12.0708,0.756,0.7875,1508.85,150.885,1357.965,1448.496,-90.531,4e-05
3,APAC,2022-02-01,Government,Japan,JPY,Montana,High,12.6945,1.638,0.0756,152.334,21.32676,131.00724,38.0835,92.92374,4e-05
4,APAC,2022-02-01,Government,Japan,JPY,Paseo,High,15.3594,0.756,0.7875,1919.925,287.98875,1631.93625,1843.128,-211.19175,4e-05
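
The output above shows `Units Sold` and `value` scaled along with the monetary columns. A possible alternative, sketched under the assumption that only the price and amount columns are in local currency, converts an explicit list of columns and leaves everything else alone. The names `MONEY_COLS` and `convert_money_columns` are hypothetical; the toy row reuses figures from the first merged row above.

```python
import pandas as pd

# Assumed list of columns expressed in local currency.
MONEY_COLS = ["Manufacturing Price", "Sale Price", "Gross Sales",
              "Discounts", "Sales", "COGS", "Profit"]

def convert_money_columns(df, rate_col="value", money_cols=MONEY_COLS):
    """Return a copy of df with only the monetary columns multiplied by the rate."""
    out = df.copy()
    present = [c for c in money_cols if c in out.columns]
    # Row-wise multiplication: each row is scaled by its own exchange rate.
    out[present] = out[present].mul(out[rate_col], axis=0)
    return out

# Toy frame with a subset of the columns, using values from the first merged row.
toy = pd.DataFrame({
    "Units Sold": [1778.0],
    "Sale Price": [350.0],
    "Sales": [597408.0],
    "value": [0.0063],
})

toy_eur = convert_money_columns(toy)

print(toy_eur)
```

The `Sales` total used later is the same either way, since `Sales` is multiplied by the original rate in both versions.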


In [65]:
# checking date type

df_converted['Date'].dtype

dtype('O')

In [66]:
# converting to datetime

df_converted['Date'] = pd.to_datetime(df_converted['Date'])

df_converted['Date'].dtype # checking

dtype('<M8[ns]')

In [67]:
# extracting year from date

df_converted['year'] = df_converted['Date'].dt.year

df_converted.head()

Unnamed: 0,Region,Date,Segment,Country,Currency,Product,Discount Band,Units Sold,Manufacturing Price,Sale Price,Gross Sales,Discounts,Sales,COGS,Profit,value,year
0,APAC,2022-02-01,Small Business,Japan,JPY,VTT,Low,11.2014,1.638,2.205,3920.49,156.8196,3763.6704,2912.364,851.3064,4e-05,2022
1,APAC,2022-02-01,Enterprise,Japan,JPY,Velo,Medium,15.3468,1.575,1.89,4604.04,276.2424,4327.7976,3836.7,491.0976,4e-05,2022
2,APAC,2022-02-01,Government,Japan,JPY,Amarilla,High,12.0708,0.756,0.7875,1508.85,150.885,1357.965,1448.496,-90.531,4e-05,2022
3,APAC,2022-02-01,Government,Japan,JPY,Montana,High,12.6945,1.638,0.0756,152.334,21.32676,131.00724,38.0835,92.92374,4e-05,2022
4,APAC,2022-02-01,Government,Japan,JPY,Paseo,High,15.3594,0.756,0.7875,1919.925,287.98875,1631.93625,1843.128,-211.19175,4e-05,2022


In [84]:
# aggregating total sales by region, segment, and year

df_sales = df_converted.groupby(['Region', 'Segment', 'year']).agg(total_sales = ('Sales', 'sum'))

df_sales.head()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,total_sales
Region,Segment,year,Unnamed: 3_level_1
APAC,Channel Partners,2022,296720.122271
APAC,Channel Partners,2023,354735.434574
APAC,Enterprise,2022,274177.242683
APAC,Enterprise,2023,259736.397979
APAC,Government,2022,840175.462485


In [86]:
# resetting index

df_sales.reset_index(inplace=True)

In [74]:
# pivoting year

df_sales_pivot = pd.pivot(df_sales, index = ['Region', 'Segment'], columns = 'year', values = 'total_sales')

df_sales_pivot.reset_index(inplace = True)

df_sales_pivot.head()

year,Region,Segment,2022,2023
0,APAC,Channel Partners,296720.122271,354735.4
1,APAC,Enterprise,274177.242683,259736.4
2,APAC,Government,840175.462485,1089334.0
3,APAC,Midmarket,617042.843926,563486.1
4,APAC,Small Business,372555.686345,377784.0


### Task 2

In [87]:
df_sales

Unnamed: 0,Region,Segment,year,total_sales
0,APAC,Channel Partners,2022,296720.1
1,APAC,Channel Partners,2023,354735.4
2,APAC,Enterprise,2022,274177.2
3,APAC,Enterprise,2023,259736.4
4,APAC,Government,2022,840175.5
5,APAC,Government,2023,1089334.0
6,APAC,Midmarket,2022,617042.8
7,APAC,Midmarket,2023,563486.1
8,APAC,Small Business,2022,372555.7
9,APAC,Small Business,2023,377784.0


In [105]:
# using shift to find yoy variation (in percent) within each Region/Segment group

def multi_row(group):

    group = group.copy()

    group['prior_sales'] = group['total_sales'].shift(1)

    group['yoy_variation'] = (group['total_sales'] - group['prior_sales']) / group['prior_sales'] * 100

    return group

In [112]:
# applying function

df_variation = df_sales.groupby(['Region', 'Segment']).apply(multi_row).reset_index(drop = True)

df_variation.head()

Unnamed: 0,Region,Segment,year,total_sales,prior_sales,yoy_variation
0,APAC,Channel Partners,2022,296720.122271,,
1,APAC,Channel Partners,2023,354735.434574,296720.122271,19.5522
2,APAC,Enterprise,2022,274177.242683,,
3,APAC,Enterprise,2023,259736.397979,274177.242683,-5.266974
4,APAC,Government,2022,840175.462485,,
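
The same year-over-year figure can likely be obtained without a custom function by using `pct_change` on the grouped sales. This is a sketch on a toy frame that reuses the four APAC totals shown above; it assumes the rows are sorted by year within each group.

```python
import pandas as pd

# Toy frame built from the APAC totals printed above.
toy = pd.DataFrame({
    "Region": ["APAC"] * 4,
    "Segment": ["Channel Partners", "Channel Partners", "Enterprise", "Enterprise"],
    "year": [2022, 2023, 2022, 2023],
    "total_sales": [296720.122271, 354735.434574, 274177.242683, 259736.397979],
})

# Sort so that each group's rows run from the earlier year to the later one.
toy = toy.sort_values(["Region", "Segment", "year"]).reset_index(drop=True)

# pct_change returns (current - previous) / previous within each group.
toy["yoy_variation"] = toy.groupby(["Region", "Segment"])["total_sales"].pct_change() * 100

print(toy)
```

The first row of each group has no prior year, so its variation is `NaN`, matching the `shift`-based result.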


In [113]:
# collapsing to one yoy figure per region and segment (the 2022 row is NaN, so the sum is just the 2023 variation)

df_variation = df_variation.groupby(['Region', 'Segment']).agg(total_variation = ('yoy_variation', 'sum'))

df_variation.reset_index(inplace = True)

In [117]:
# keeping only the rows where the variation is negative

df_queried = df_variation.query("total_variation<0")

df_queried

Unnamed: 0,Region,Segment,total_variation
1,APAC,Enterprise,-5.266974
3,APAC,Midmarket,-8.679589
5,Europe,Channel Partners,-5.46722
6,Europe,Enterprise,-33.71007
7,Europe,Government,-25.839295
8,Europe,Midmarket,-34.231508
11,USA,Enterprise,-27.562602
13,USA,Midmarket,-10.006724
14,USA,Small Business,-5.063989


In [120]:
# counting the regions with a negative variation for each segment

df_final = df_queried['Segment'].value_counts().reset_index()

df_final

Unnamed: 0,Segment,count
0,Enterprise,3
1,Midmarket,3
2,Channel Partners,1
3,Government,1
4,Small Business,1


In [122]:
# keeping the segments that are negative in all three regions

df_final.query("count == 3")

Unnamed: 0,Segment,count
0,Enterprise,3
1,Midmarket,3
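
As a cross-check on the counting approach, the "negative in every region" condition can also be expressed directly on a wide table. This is a sketch on toy data: the Enterprise figures and the Europe Government figure come from the output above, while the two positive Government values are placeholders invented for illustration.

```python
import pandas as pd

# Toy variation table; the 10.0 entries are hypothetical placeholders.
toy_variation = pd.DataFrame({
    "Region": ["APAC", "Europe", "USA", "APAC", "Europe", "USA"],
    "Segment": ["Enterprise"] * 3 + ["Government"] * 3,
    "total_variation": [-5.266974, -33.71007, -27.562602, 10.0, -25.839295, 10.0],
})

# One row per segment, one column per region.
wide = toy_variation.pivot(index="Segment", columns="Region", values="total_variation")

# True only where the variation is below zero in every region column.
negative_everywhere = wide.lt(0).all(axis=1)

segments = negative_everywhere[negative_everywhere].index.tolist()

print(segments)
```

A segment with no data for some region would show `NaN` in that column and would not be flagged, since `NaN < 0` is `False`.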
