## Part 2

In this section, we calculate the correlations between all stocks using the returns calculated in part 1. We also write functions that allow users to conveniently:
- print out the correlation between two companies
- print out the companies most positively and most negatively correlated with a specified company

### Correlation dataframe using pandas

First, we calculated the correlations between companies using the built-in `DataFrame.corr()` method of the pandas library.

In [None]:
def corTable(returns):
    # Input: a pandas dataframe containing the returns of all the stocks
    # Output: a symmetric pandas dataframe with the correlations between all companies
    # This uses the built-in method; the manual calculation is shown further below.
    return returns.corr()

With this corTable function we can now create our correlationTable using returns from part 1. 

In [None]:
# store correlation results (to use as input for other functions)
correlationTable = corTable(returns) 
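
To make the shape of the result concrete, here is a minimal sketch on a small made-up returns table. The tickers `AAA`, `BBB`, `CCC` and their values are hypothetical, not data from part 1. It illustrates that `DataFrame.corr()` returns a square, symmetric table with ones on the diagonal.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for three made-up tickers (illustration only)
toy_returns = pd.DataFrame({
    'AAA': [0.01, -0.02, 0.03, 0.00, 0.01],
    'BBB': [0.02, -0.01, 0.02, 0.01, 0.00],
    'CCC': [-0.01, 0.02, -0.03, 0.01, -0.02],
})

toy_table = toy_returns.corr()  # Pearson correlation by default
values = toy_table.to_numpy()

# Rows and columns are both indexed by ticker
print(toy_table.shape)
# The table is symmetric and each stock has correlation 1 with itself
print(np.allclose(values, values.T))
print(np.allclose(np.diag(values), 1.0))
```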

#### Checking pandas results with direct calculation from definition using Python

To check that the pandas built-in method has done its job accurately, we created a function that calculates the correlation manually from the definition and compares it with the number obtained from pandas.
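
For reference, the definition used in the manual check can be written as below. The notation is ours: $a_i$ and $b_i$ are the returns of the two companies on day $i$, $n$ is the number of observations, $\bar{a}$ and $\bar{b}$ are the sample means, and $\sigma_a$, $\sigma_b$ are the population standard deviations (dividing by $n$, which is the default behaviour of `np.std`).

$$
r_{ab} = \frac{1}{n} \sum_{i=1}^{n} \frac{a_i - \bar{a}}{\sigma_a} \cdot \frac{b_i - \bar{b}}{\sigma_b}
$$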

First, we need a pandas dataframe that maps each company's ticker symbol to its full name:

In [None]:
compData = pd.read_csv('SP_500_firms.csv', index_col = 0)

This is the function that checks whether the pandas correlation results match the manual calculation:

In [None]:
##### Compare the pandas method with a manual Python calculation of the correlation
def testCor(correlationTable, companyA, companyB):
    # Note: this reads the global `returns` dataframe from part 1
    print('Pandas method:')
    print(correlationTable.loc[companyA, companyB])
    print('Standard data structure method:')
    a = np.array(returns.get(companyA).tolist(), dtype=float)
    b = np.array(returns.get(companyB).tolist(), dtype=float)
    # Mean of the products of standardised returns (np.std divides by n)
    print(np.sum((a - np.mean(a)) / np.std(a) * (b - np.mean(b)) / np.std(b)) / len(a))

For example, to check that the correlation between Google and Facebook is calculated correctly, we would run the following:

In [None]:
# Test whether the built-in and manual ways of finding the correlation agree
testCor(correlationTable, 'GOOGL', 'FB')
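
Because the two printed numbers have to be compared by eye, a tolerance-based check can be handy. The sketch below is one way to do this on made-up data; the helper name `manual_corr` and the tickers are hypothetical and not part of the notebook. It assumes the two series have no missing values, since `DataFrame.corr()` skips missing pairs whereas the plain NumPy formula would return `nan`.

```python
import numpy as np
import pandas as pd

def manual_corr(a, b):
    # Pearson correlation from the definition, using population
    # standard deviations (np.std divides by n by default)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    za = (a - a.mean()) / a.std()
    zb = (b - b.mean()) / b.std()
    return np.sum(za * zb) / len(a)

# Hypothetical returns for two made-up tickers (illustration only)
toy_returns = pd.DataFrame({
    'AAA': [0.01, -0.02, 0.03, 0.00, 0.01],
    'BBB': [0.02, -0.01, 0.02, 0.01, 0.00],
})

from_pandas = toy_returns.corr().loc['AAA', 'BBB']
from_definition = manual_corr(toy_returns['AAA'], toy_returns['BBB'])

# Compare with a tolerance rather than exact equality, because the two
# computations accumulate floating-point rounding differently
print(np.isclose(from_pandas, from_definition))
```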

### Printing correlation between two companies

Here is the function we created to report the correlation between two specified companies. It fetches the corresponding entry in the correlationTable and returns it together with the two full company names, which the notebook then displays.

In [None]:
##### Return the full names of company A and B and their correlation
def printCor(correlationTable, companyA, companyB):
    corr = correlationTable.loc[companyA,companyB]
    nameA = compData.loc[companyA,'Name']
    nameB = compData.loc[companyB,'Name']
    return nameA, nameB, corr

For example, to get the correlation between Amazon and Facebook, we would run:

In [None]:
printCor(correlationTable, 'FB', 'AMZN')
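
If a formatted sentence is preferred over a tuple, a variant along the following lines could be used. This is only a sketch: `describeCor`, the toy tables and the company names are hypothetical stand-ins for `correlationTable` and `compData`, and the name lookup is passed in as an argument rather than read from a global.

```python
import pandas as pd

# Made-up stand-ins for correlationTable and compData (illustration only)
toy_table = pd.DataFrame(
    [[1.0, 0.6], [0.6, 1.0]],
    index=['AAA', 'BBB'], columns=['AAA', 'BBB'],
)
toy_names = pd.DataFrame(
    {'Name': ['Alpha Corp', 'Beta Inc']}, index=['AAA', 'BBB']
)

def describeCor(table, names, companyA, companyB):
    # Look up the correlation and both full names, then format a sentence
    corr = table.loc[companyA, companyB]
    nameA = names.loc[companyA, 'Name']
    nameB = names.loc[companyB, 'Name']
    return f'Correlation between {nameA} and {nameB}: {corr:.3f}'

print(describeCor(toy_table, toy_names, 'AAA', 'BBB'))
# Correlation between Alpha Corp and Beta Inc: 0.600
```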

### Printing top and bottom correlated companies

Here is the function we created to print the top and bottom correlated companies (the five most positively and the five most negatively correlated) of a specified company.

In [None]:
##### List the top and bottom correlated companies of a company
def topCor(correlationTable, company):
    # Use names other than min/max so the Python built-ins are not shadowed
    lowest = correlationTable[company].sort_values().iloc[0:5]
    # Skip position 0, which is the company's correlation with itself (1.0)
    highest = correlationTable[company].sort_values(ascending=False).iloc[1:6]
    # Replace ticker symbols with full company names
    lowest.index = [compData.loc[i, 'Name'] for i in lowest.index]
    highest.index = [compData.loc[i, 'Name'] for i in highest.index]
    print('Most -ve correlated:')
    print(lowest)
    print('Most +ve correlated:')
    print(highest)

We then used the function topCor to find the top and bottom correlated companies of Apple, Amazon, Google, Facebook and Microsoft. 

In [None]:
topCor(correlationTable,'AAPL')
topCor(correlationTable,'AMZN')
topCor(correlationTable,'MSFT')
topCor(correlationTable,'FB')
topCor(correlationTable,'GOOGL')
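
The slice `[1:6]` relies on the company itself sorting into first place. As an alternative sketch, the self-correlation can be dropped by label before ranking, using `Series.nsmallest` and `Series.nlargest`. The function name `rankCor`, the parameter `n` and the toy table below are hypothetical and only for illustration.

```python
import pandas as pd

# Made-up correlation table for four hypothetical tickers (illustration only)
tickers = ['AAA', 'BBB', 'CCC', 'DDD']
toy_table = pd.DataFrame(
    [[1.0, 0.8, -0.3, 0.1],
     [0.8, 1.0, 0.2, -0.5],
     [-0.3, 0.2, 1.0, 0.4],
     [0.1, -0.5, 0.4, 1.0]],
    index=tickers, columns=tickers,
)

def rankCor(table, company, n=2):
    # Drop the company's correlation with itself (always 1) by label,
    # so the result does not depend on where it lands after sorting
    others = table[company].drop(company)
    return others.nsmallest(n), others.nlargest(n)

lowest, highest = rankCor(toy_table, 'AAA')
print(list(lowest.index))   # ['CCC', 'DDD']
print(list(highest.index))  # ['BBB', 'DDD']
```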

#### Interpreting the above results

Some of these results are interesting, and several are expected:

Google:
- Its very high positive correlation with Alphabet Inc is expected since Alphabet is the parent company of Google. If the value of Google goes up, the value of Alphabet Inc is likely to go up, and vice versa.
- Its high positive correlation with giant tech companies such as Facebook and Amazon is also expected, since they are very similar companies and hence would perform similarly in the same market conditions.

Facebook:
- Its high positive correlation with Mastercard and Fiserv (both financial technology companies) is expected because Facebook is a platform that supports advertising and encourages people to shop and pay online, which helps these companies expand their businesses.


Amazon:
- Its positive correlation with Alphabet is expected because they are similar types of company. Tech companies are likely to perform similarly over time under the same market conditions.
- Its positive correlation with Mastercard and Visa is expected because they are the two most popular ways to pay online, and hence Visa and Mastercard would benefit from good performance by Amazon.

Apple: 
- Skyworks and Illinois Tool Works are electronic component manufacturers (Skyworks in particular makes chips for wireless handsets). Apple's products may require chips and components from these companies, so their stock performance would be correlated: when Apple announces a new product and its value rises, these component manufacturers are likely to earn more by selling components to Apple.