# Analyzing Stock Data

It's time to put your Python skills and financial knowledge to the test!

You are given the monthly stock prices of two e-commerce companies, Amazon (AMZN) and eBay (EBAY). Help us analyze the risk and return of each investment! You will calculate the rates of return from this data, as well as other key statistics, such as variance and correlation, for assessing risk.

Let's get started!

## Inspect Data and Code

1. The code block below provides the monthly stock prices of Amazon (AMZN) and eBay (EBAY) over the course of a year (June 2018 to June 2019). You are also given a helper function that displays decimal values (such as `0.075`) in percentage form (`7.5%`).

   The code block also provides functions for calculating the log rate of return, variance, standard deviation, and correlation coefficient. 
   
   Take a moment to familiarize yourself with these functions and proceed to the next step when you are ready!

In [1]:
def display_as_percentage(val):
    return '{:.1f}%'.format(val*100)

amazon_prices = [1699.8, 1777.44, 2012.71, 2003.0, 1598.01, 1690.17, 1501.97, 1718.73, 1639.83, 1780.75, 1926.52, 1775.07, 1893.63]
ebay_prices = [35.98, 33.2, 34.35, 32.77, 28.81, 29.62, 27.86, 33.39, 37.01, 37.0, 38.6, 35.93, 39.5]

from math import log, sqrt

# Calculate Log Return
def calculate_log_return(start_price, end_price):
    return log(end_price / start_price)

# Calculate Variance
def calculate_variance(dataset):
    mean = sum(dataset)/len(dataset)
    numerator = 0
    for data in dataset:
        numerator += (data-mean) ** 2
    return numerator / len(dataset)

# Calculate Standard Deviation
def calculate_stddev(dataset):
    variance = calculate_variance(dataset)
    return sqrt(variance)

# Calculate Correlation Coefficient
def calculate_correlation(set_x, set_y):
    sum_x = sum(set_x)
    sum_y = sum(set_y)
    
    sum_x2 = sum([x ** 2 for x in set_x])
    sum_y2 = sum([y ** 2 for y in set_y])
    sum_xy = sum([x * y for x,y in zip(set_x, set_y)])
    
    n = len(set_x)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator
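
As a quick illustration of how a log return differs from a simple percentage change, the sketch below repeats two of the helpers from the cell above so that it runs on its own. The prices `100.0` and `110.0` are made-up values chosen only for illustration; they are not part of the dataset.

```python
from math import log

# Helpers repeated from the cell above so this snippet runs on its own.
def display_as_percentage(val):
    return '{:.1f}%'.format(val * 100)

def calculate_log_return(start_price, end_price):
    return log(end_price / start_price)

# Hypothetical prices, not taken from the dataset.
start_price, end_price = 100.0, 110.0

simple_return = (end_price - start_price) / start_price
log_return_up = calculate_log_return(start_price, end_price)
log_return_down = calculate_log_return(end_price, start_price)

print("Simple return:", display_as_percentage(simple_return))
print("Log return:", display_as_percentage(log_return_up))

# A rise followed by a fall back to the starting price sums to
# (approximately) zero in log terms, which is why log returns can be added.
print("Round trip:", log_return_up + log_return_down)
```

For small price changes the two measures are close, and the gap widens as the change gets larger.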

## Calculate Rate of Return

2. Let's start by calculating the logarithmic rates of return from the stock prices. Define a function called `get_returns()` that takes a parameter called `prices`, which will be a list of stock prices.

   The function will eventually return a list of log returns calculated from each adjacent pair of prices. For now, create a variable called `returns` inside the function body, and set it equal to an empty list.

In [8]:
def get_returns(prices):
    returns = []
    for i in range(len(prices) - 1):
        returns.append(calculate_log_return(prices[i], prices[i + 1]))
    return returns

3. Next, use a `for` loop to iterate over the `prices` list, from the first element to the second-to-last.

   You will be accessing the elements by their indices, so use the `range()` function to help generate the sequence of numbers to iterate over. Recall that Python uses zero-based indexing, meaning the index numbering of the list starts with `0` and ends with `n - 1`, where `n` is the number of elements in the list.
   
   Comment out the `for` loop for now so that the kernel doesn't throw an error at you.

4. As you iterate over each index `i`, the element in the `prices` list that is at position `i` will be the start price and the element with index `i + 1` will be the end price. Use the `calculate_log_return()` function to calculate the rate of return from the start and end prices. Then, append the rate of return to the `returns` list.

   After your for loop, return `returns` from the `get_returns()` function.

5. Use the `get_returns()` function to find the monthly log rates of return from the Amazon and eBay stock prices. Store those lists of returns in the variables `amazon_returns` and `ebay_returns`, respectively.

In [9]:
amazon_returns = get_returns(amazon_prices)
ebay_returns = get_returns(ebay_prices)
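
The index-based pairing described in steps 3 and 4 can also be sketched with `zip()`, which walks the list and a copy shifted by one position at the same time. The name `get_returns_zip()` is hypothetical and is only an alternative illustration, not part of the exercise.

```python
from math import log

def get_returns_zip(prices):
    """Hypothetical variant of get_returns() that pairs prices with zip()."""
    # zip(prices, prices[1:]) yields (start, end) for each adjacent pair
    # and stops at the second-to-last start price automatically.
    return [log(end / start) for start, end in zip(prices, prices[1:])]

ebay_prices = [35.98, 33.2, 34.35, 32.77, 28.81, 29.62, 27.86,
               33.39, 37.01, 37.0, 38.6, 35.93, 39.5]

ebay_returns_zip = get_returns_zip(ebay_prices)

# n prices always produce n - 1 returns.
print(len(ebay_prices), "prices ->", len(ebay_returns_zip), "returns")
```

Either form yields one return per adjacent pair of prices, so 13 monthly prices give 12 monthly returns.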

6. Time to print out the lists of monthly returns! Since rates of return are often expressed as percentages, use the `display_as_percentage()` function and a list comprehension to display each value in `amazon_returns` and `ebay_returns` as a percentage.

   How do the monthly returns of the two stocks compare? Are they on average profitable?

In [12]:
print("Amazon returns:", ', '.join([display_as_percentage(r) for r in amazon_returns]))
print("eBay returns:", ', '.join([display_as_percentage(r) for r in ebay_returns]))

Amazon returns: 4.5%, 12.4%, -0.5%, -22.6%, 5.6%, -11.8%, 13.5%, -4.7%, 8.2%, 7.9%, -8.2%, 6.5%
eBay returns: -8.0%, 3.4%, -4.7%, -12.9%, 2.8%, -6.1%, 18.1%, 10.3%, -0.0%, 4.2%, -7.2%, 9.5%
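
One way to approach the question of average profitability is to compute the mean monthly log return and count the months with a positive return. The sketch below is self-contained; `monthly_log_returns()` is a hypothetical stand-in for `get_returns()` so that the snippet runs on its own.

```python
from math import log

amazon_prices = [1699.8, 1777.44, 2012.71, 2003.0, 1598.01, 1690.17, 1501.97,
                 1718.73, 1639.83, 1780.75, 1926.52, 1775.07, 1893.63]
ebay_prices = [35.98, 33.2, 34.35, 32.77, 28.81, 29.62, 27.86,
               33.39, 37.01, 37.0, 38.6, 35.93, 39.5]

def monthly_log_returns(prices):
    # Hypothetical stand-in for get_returns(): one log return per adjacent pair.
    return [log(prices[i + 1] / prices[i]) for i in range(len(prices) - 1)]

amazon_returns = monthly_log_returns(amazon_prices)
ebay_returns = monthly_log_returns(ebay_prices)

# Mean monthly log return for each stock.
amazon_mean = sum(amazon_returns) / len(amazon_returns)
ebay_mean = sum(ebay_returns) / len(ebay_returns)

# Number of months in which the price rose.
amazon_up_months = sum(1 for r in amazon_returns if r > 0)
ebay_up_months = sum(1 for r in ebay_returns if r > 0)

print("Amazon mean monthly return: {:.2f}%".format(amazon_mean * 100))
print("eBay mean monthly return: {:.2f}%".format(ebay_mean * 100))
print("Months with a gain:", amazon_up_months, "(Amazon),", ebay_up_months, "(eBay)")
```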


7. Now, let's calculate the annual rate of return for each stock!

   Recall that log returns can easily be aggregated over time. Since `amazon_returns` and `ebay_returns` contain the monthly log returns for all 12 months in the past year, the annual return is simply the sum of all monthly returns. Use the `display_as_percentage()` function to help format the annual return as a percentage when you print out the results.
   
   How do the annual returns of the two stocks compare?

In [13]:
amazon_annual_return = sum(amazon_returns)
ebay_annual_return = sum(ebay_returns)

print("Amazon annual return:", display_as_percentage(amazon_annual_return))
print("eBay annual return:", display_as_percentage(ebay_annual_return))

Amazon annual return: 10.8%
eBay annual return: 9.3%
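
The reason log returns can be summed is that the intermediate prices cancel out: the sum of the monthly log returns equals the log of the last price divided by the first. The sketch below checks this for the Amazon prices and, as an additional illustration, converts the annual log return into a simple return with `exp()`. The variable names are chosen for this example only.

```python
from math import exp, log

amazon_prices = [1699.8, 1777.44, 2012.71, 2003.0, 1598.01, 1690.17, 1501.97,
                 1718.73, 1639.83, 1780.75, 1926.52, 1775.07, 1893.63]

# Monthly log returns from each adjacent pair of prices.
monthly = [log(end / start) for start, end in zip(amazon_prices, amazon_prices[1:])]

# Summing the monthly log returns...
annual_log_return = sum(monthly)
# ...matches the log return computed directly from the first and last price.
direct_log_return = log(amazon_prices[-1] / amazon_prices[0])

# Converting a log return r to a simple return: exp(r) - 1.
annual_simple_return = exp(annual_log_return) - 1

print("Sum of monthly log returns:", annual_log_return)
print("Log return, first to last: ", direct_log_return)
print("Equivalent simple return:  ", annual_simple_return)
```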


## Assess Investment Risk

8. Let's move on to assessing the risk of each investment! Start by calculating the variance of each stock's monthly returns. Use the `calculate_variance()` function we provided in the first task and print out the results.

   How do the variances of the two stocks compare? What does this tell you about their relative risk?

In [14]:
amazon_variance = calculate_variance(amazon_returns)
ebay_variance = calculate_variance(ebay_returns)

print("Amazon variance:", amazon_variance)
print("eBay variance:", ebay_variance)

Amazon variance: 0.010738060556609724
eBay variance: 0.007459046435081462
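
Note that `calculate_variance()` divides by `len(dataset)`, which makes it a population variance. As a cross-check, the sketch below uses Python's standard `statistics` module: `pvariance()` is the population form, while `variance()` is the sample form that divides by `n - 1`. Treating the 12 monthly returns as a sample rather than a full population is an assumption made here for comparison only.

```python
from math import log
from statistics import pvariance, variance

amazon_prices = [1699.8, 1777.44, 2012.71, 2003.0, 1598.01, 1690.17, 1501.97,
                 1718.73, 1639.83, 1780.75, 1926.52, 1775.07, 1893.63]

# Monthly log returns, computed inline so the snippet runs on its own.
amazon_returns = [log(end / start)
                  for start, end in zip(amazon_prices, amazon_prices[1:])]

n = len(amazon_returns)
population_variance = pvariance(amazon_returns)  # divides by n
sample_variance = variance(amazon_returns)       # divides by n - 1

print("Population variance:", population_variance)
print("Sample variance:    ", sample_variance)

# The two differ only by the factor n / (n - 1).
print("Ratio:", sample_variance / population_variance, "vs", n / (n - 1))
```

With only 12 observations the sample variance is about 9% larger than the population variance, so it is worth stating which one a reported figure uses.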


9. Now, calculate the standard deviation of each stock's monthly returns using the `calculate_stddev()` function, and print out the results.

   Recall that the standard deviation has the same unit of measurement as the original dataset, or the monthly returns in this case. Since rates of return are often expressed as a percentage, use `display_as_percentage()` to help format the standard deviation for easier interpretation.

In [15]:
amazon_stddev = calculate_stddev(amazon_returns)
ebay_stddev = calculate_stddev(ebay_returns)

print("Amazon standard deviation:", display_as_percentage(amazon_stddev))
print("eBay standard deviation:", display_as_percentage(ebay_stddev))

Amazon standard deviation: 10.4%
eBay standard deviation: 8.6%


10. Finally, calculate the correlation between the stock returns using the `calculate_correlation()` function, and print out the results.

    Are Amazon and eBay stock returns strongly or weakly correlated? Is the correlation positive or negative?

In [16]:
correlation_amazon_ebay = calculate_correlation(amazon_returns, ebay_returns)
print("Correlation between the returns of Amazon and eBay stocks:", correlation_amazon_ebay)

Correlation between the returns of Amazon and eBay stocks: 0.6776978564073072
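
`calculate_correlation()` uses a computational shortcut built from sums. An equivalent way to see what it measures is the definitional form: the covariance of the two return series divided by the product of their standard deviations. The sketch below is a self-contained illustration of that form; the helper names `log_returns()` and `pearson()` are hypothetical and not part of the exercise.

```python
from math import log, sqrt

amazon_prices = [1699.8, 1777.44, 2012.71, 2003.0, 1598.01, 1690.17, 1501.97,
                 1718.73, 1639.83, 1780.75, 1926.52, 1775.07, 1893.63]
ebay_prices = [35.98, 33.2, 34.35, 32.77, 28.81, 29.62, 27.86,
               33.39, 37.01, 37.0, 38.6, 35.93, 39.5]

def log_returns(prices):
    # Hypothetical helper: one log return per adjacent pair of prices.
    return [log(end / start) for start, end in zip(prices, prices[1:])]

def pearson(xs, ys):
    # Hypothetical helper: covariance divided by the product of the
    # (population) standard deviations.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    stddev_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    stddev_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return covariance / (stddev_x * stddev_y)

correlation = pearson(log_returns(amazon_prices), log_returns(ebay_prices))
print("Correlation (covariance form):", correlation)
```

Up to floating-point rounding, this should agree with the value printed above.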
