# Numpy Example - Foreign Exchange Rate Data
## A Simple Case Study About Vacation Decision
Datafile: "14_Foreign_Exchange_Rates_PureNumeric.csv"

In [None]:
# import the NumPy library
import numpy as np

### Suppose you want to take a vacation next July. You have several destinations in mind: Australia, the United Kingdom, and Singapore. Because you want to explore the country thoroughly, you can choose only one of them, and you plan to stay there for a month (30 days).
### However, all three destinations appeal to you, and you find it hard to make the final decision. Recently, you found some data containing exchange rates for several countries. You decide to use the estimated cost to help you decide.

### Let's start with reading the data
Pick your favorite approach from the previous example (Foreign Exchange Rate Dataset File Reading) to read the data. Be careful: the code below contains some minor changes.

In [None]:
# Define a "ShowData" function - note the default value for the (now) optional parameter.
#  dataset is a list of lists of strings.
def ShowData(dataset = [["No dataset sent"]]):
    for r in dataset:
        # print elements in a tab-separated format
        print (*r, sep ="\t")

# sample calls
ShowData([["one", "two", "three"], ["four", "five", "six"], ["seven", "eight", "nine"]])
#show()

In [None]:
#
# Initial version - "standard programming"
#
# Define a list for the data.  Will be a list of lists.
data = []
# open the file
fname = "../data/14_Foreign_Exchange_Rates_PureNumeric.csv"
f = open(fname, "r")
# skip the header line: the first readline() reads the header,
# the second leaves the first data row in `line`
for i in range(2):
    line = f.readline()
# loop until we run out of lines
while (line):
    # strip the newline and tokenize (split on commas, in this case)
    tokens = line.rstrip().split(',')
    # pick the target columns: Australia (2), United Kingdom (5), Singapore (14)
    target = [float(tokens[2]), float(tokens[5]), float(tokens[14])]
    # append this record to the dataset
    data.append(target)
    # read the next line
    line = f.readline()
# close the file
f.close()
# show the data
ShowData(data)
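
As a possible alternative to the manual loop, NumPy can read selected columns directly. The sketch below is not part of the original example: it assumes the file has exactly one header row, is comma-separated, and holds numeric values in columns 2, 5 and 14 of every row. To stay self-contained it reads a small in-memory sample with made-up values instead of the real file.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV: one header row, then 15 comma-separated
# columns per record (index, date, then 13 numeric rate columns).
header = ",".join("col{}".format(i) for i in range(15))
row_a = "0,7/1/2019," + ",".join(["1.5"] * 13)
row_b = "1,7/2/2019," + ",".join(["2.5"] * 13)
sample = io.StringIO("\n".join([header, row_a, row_b]))

# skip_header=1 drops the header row; usecols keeps only the target columns,
# so the text in the date column is never converted to a number.
rates = np.genfromtxt(sample, delimiter=",", skip_header=1, usecols=(2, 5, 14))
print(rates.shape)
```

With the real file, the `sample` object would be replaced by the path stored in `fname`, provided the assumptions above hold for that file.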

### It is always worth verifying that you selected the correct data points. By checking the CSV file, we can confirm that the output has three columns: Australia, United Kingdom, and Singapore.

### Next, you search the web for interesting places to visit in these three countries and for the cost of food, accommodation, travel, and so on. After a careful investigation, you have the following findings:
#### Daily expense in Australia: 500 Australian dollars
#### Daily expense in United Kingdom: 275 pounds
#### Daily expense in Singapore: 580 Singapore dollars

## This raises a question: which exchange rate should I use?
#### There is no "correct" answer; it depends on you. You can use the maximum value, the minimum value, the latest value, the mean value of the last year, and so on. But you need to understand that different choices may give different results and therefore influence your final decision.
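
To make the candidate choices concrete, here is a minimal sketch that computes several of them for a single currency. The rate values are made up for illustration, and "latest" assumes the rows are stored in chronological order.

```python
import numpy as np

# Hypothetical daily rates for one currency (foreign units per 1 USD).
rates = np.array([1.40, 1.45, 1.38, 1.50, 1.42])

candidates = {
    "min": rates.min(),
    "max": rates.max(),
    "mean": rates.mean(),
    "latest": rates[-1],  # assumes rows are in chronological order
}
for name, value in candidates.items():
    print("{:>6}: {:.4f}".format(name, value))
```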

### Here we will show three examples: the mean and the minimum over the whole period, and the mean for July 2019.
### The first uses the mean value over the whole period
If you remember the previous example, we wrote our own code to calculate the mean by summing the values of each row and then dividing by its length. NumPy simplifies this process.

In [None]:
# Convert data to a numpy array
npdata = np.array(data)
npdata

In [None]:
# Australia mean exchange rate
AUSMean = npdata[:,0].mean()
# United Kingdom mean exchange rate
UKMean = npdata[:,1].mean()
# Singapore mean exchange rate
SGMean = npdata[:,2].mean()
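
The three per-column calls can also be collapsed into one by passing `axis=0`, which averages down the rows and returns one mean per column. A small sketch on a made-up array with the same column layout as `npdata` (the names `sample`, `col_means`, and the lowercase variables are illustrative):

```python
import numpy as np

# Hypothetical 3-row sample: columns are Australia, United Kingdom, Singapore
sample = np.array([
    [1.40, 0.80, 1.35],
    [1.50, 0.70, 1.37],
    [1.60, 0.75, 1.36],
])

# axis=0 averages over the rows, giving one value per column
col_means = sample.mean(axis=0)
aus_mean, uk_mean, sg_mean = col_means
print(col_means)
```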

In [None]:
AUSMean

#### You will find that this is exactly the same value as in our previous example. So, what are the estimated costs in US dollars?

In [None]:
# Estimated cost in AUS
AUSCost = 500/AUSMean *30
# Estimated cost in UK
UKCost = 275/UKMean *30
# Estimated cost in SG
SGCost = 580/SGMean *30
# print results
print("The estimated cost for travelling to Australia is ${:.2f} ".format(AUSCost))
print("The estimated cost for travelling to United Kingdom is ${:.2f} ".format(UKCost))
print("The estimated cost for travelling to Singapore is ${:.2f} ".format(SGCost))
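
The division in the cell above implies that each rate is quoted as units of foreign currency per 1 US dollar. Under that assumption, the conversion can be wrapped in a small helper. The function name and the example rate are hypothetical, not part of the original notebook.

```python
def estimate_cost_usd(daily_expense_local, rate_per_usd, days=30):
    """Convert a daily budget in local currency into a total cost in USD.

    Assumes rate_per_usd is quoted as local currency units per 1 USD.
    """
    if rate_per_usd <= 0:
        raise ValueError("exchange rate must be positive")
    return daily_expense_local / rate_per_usd * days

# Hypothetical rate: 1.25 local units per 1 USD
print("${:.2f}".format(estimate_cost_usd(500, 1.25)))
```

A lower rate means each local unit costs more dollars, so the same daily budget becomes more expensive in USD.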

## According to the estimated costs above, if we use the mean value (over the whole period), a vacation in Australia has the lowest estimated cost.
### Now, let's use the minimum value over the whole period

In [None]:
# Australia min exchange rate
AUSMin = npdata[:,0].min()
# United Kingdom min exchange rate
UKMin = npdata[:,1].min()
# Singapore min exchange rate
SGMin = npdata[:,2].min()

In [None]:
# Estimated cost in AUS
AUSCost = 500/AUSMin *30
# Estimated cost in UK
UKCost = 275/UKMin *30
# Estimated cost in SG
SGCost = 580/SGMin *30

# print results
print("The estimated cost for travelling to Australia is ${:.2f} ".format(AUSCost))
print("The estimated cost for travelling to United Kingdom is ${:.2f} ".format(UKCost))
print("The estimated cost for travelling to Singapore is ${:.2f} ".format(SGCost))
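
Rather than comparing the printed numbers by eye, the cheapest destination can be picked programmatically. The sketch below uses the daily expenses listed earlier, but the exchange rates are made up for illustration, so its result says nothing about the real data.

```python
import numpy as np

countries = ["Australia", "United Kingdom", "Singapore"]
daily_expense = np.array([500.0, 275.0, 580.0])  # local currency per day
rates = np.array([1.25, 0.5, 1.6])               # hypothetical rates per 1 USD
days = 30

# Element-wise division converts all three budgets to USD in one step
costs = daily_expense / rates * days

# argmin returns the position of the smallest cost
cheapest = countries[int(np.argmin(costs))]
print(cheapest)
```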

## So, when we use the minimum value (over the whole period), we get a different result: a vacation in Singapore has the lowest estimated cost.
### Last, let's use the mean value for July 2019

In [None]:
# Get the date column
# Initial version - "standard programming"
#
# Define a list for the data.  Will be a list of lists.
data = []
# open the file
fname = "../data/14_Foreign_Exchange_Rates_PureNumeric.csv"
f = open(fname, "r")
# skip the header line: the first readline() reads the header,
# the second leaves the first data row in `line`
for i in range(2):
    line = f.readline()
# loop until we run out of lines
while (line):
    # strip the newline and tokenize (split on commas, in this case)
    tokens = line.rstrip().split(',')
    # pick the date column (index 1), kept as a string
    target = [tokens[1]]
    # append this record to the dataset
    data.append(target)
    # read the next line
    line = f.readline()
# close the file
f.close()
# show the data
ShowData(data)

In [None]:
# Convert the date column to Numpy array
date = np.array(data)
date

In [None]:
# Find the row indices for the first and last day of July 2019
# YOU NEED TO MAKE SURE THE DATES REALLY EXIST IN YOUR DATA
FirstDate = '7/1/2019'
LastDate = '7/31/2019'
# np.where returns a tuple (row indices, column indices) of the matches
FirstIndex = np.where(date == FirstDate)
LastIndex = np.where(date == LastDate)
print("Row {} is for date {}".format(FirstIndex[0][0], date[FirstIndex][0]))
print("Row {} is for date {}".format(LastIndex[0][0], date[LastIndex][0]))
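
If a date is missing from the data, `np.where` returns empty arrays and indexing with `[0][0]` fails with an `IndexError`. One way to make that failure explicit is a small lookup helper. The function name and sample dates are hypothetical, and the flattening step assumes a single-column date array like the one built above.

```python
import numpy as np

def find_date_row(dates, target):
    """Return the first row index whose date string equals target."""
    # ravel() flattens an (N, 1) column to (N,); valid for a single column only
    rows = np.flatnonzero(np.asarray(dates).ravel() == target)
    if rows.size == 0:
        raise ValueError("date {!r} not found in the data".format(target))
    return int(rows[0])

# Hypothetical date column, shaped (N, 1) like the `date` array
sample_dates = np.array([["6/28/2019"], ["7/1/2019"], ["7/2/2019"]])
print(find_date_row(sample_dates, "7/1/2019"))
```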

#### Because `date` and `npdata` were read from the same rows of the file, the row indices found in `date` can be used to slice `npdata`

In [None]:
# Let's calculate the mean values of the exchange rate for July 2019
# Australia mean exchange rate
AUSJulMean = npdata[FirstIndex[0][0]:LastIndex[0][0]+1,0].mean()
# United Kingdom mean exchange rate
UKJulMean = npdata[FirstIndex[0][0]:LastIndex[0][0]+1,1].mean()
# Singapore mean exchange rate
SGJulMean = npdata[FirstIndex[0][0]:LastIndex[0][0]+1,2].mean()
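
Slicing between two indices relies on both boundary dates being present and on the rows being in date order. A possible alternative is a boolean mask that selects every row whose date string falls in July 2019. This sketch assumes dates are written as `M/D/YYYY` without zero padding, as in `'7/1/2019'`; the sample values are made up, and an `(N, 1)` date array would first need flattening with `ravel()`.

```python
import numpy as np

# Hypothetical dates and matching rate rows (Australia, UK, Singapore)
dates = np.array(["6/28/2019", "7/1/2019", "7/2/2019", "8/1/2019", "7/1/2018"])
rates = np.array([
    [1.42, 0.78, 1.35],
    [1.43, 0.79, 1.35],
    [1.45, 0.81, 1.37],
    [1.46, 0.82, 1.37],
    [1.35, 0.76, 1.36],
])

# True only where the month is 7 and the year is 2019
in_july_2019 = np.char.startswith(dates, "7/") & np.char.endswith(dates, "/2019")

# Boolean indexing keeps the matching rows; axis=0 averages each column
july_means = rates[in_july_2019].mean(axis=0)
print(july_means)
```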

In [None]:
# Estimated cost in AUS
AUSCost = 500/AUSJulMean *30
# Estimated cost in UK
UKCost = 275/UKJulMean *30
# Estimated cost in SG
SGCost = 580/SGJulMean *30

# print results
print("The estimated cost for travelling to Australia is ${:.2f} ".format(AUSCost))
print("The estimated cost for travelling to United Kingdom is ${:.2f} ".format(UKCost))
print("The estimated cost for travelling to Singapore is ${:.2f} ".format(SGCost))

## So, based on the mean values for July 2019, a vacation in the United Kingdom has the lowest estimated cost.

## In this small example, we showed how to use the powerful NumPy package to calculate some values and help us make a decision. Compared with the previous example, an existing package saves us from writing our own code to compute these values.