# Market Simulator
ML for trading Udacity Course exercise

More info:
https://quantsoftware.gatech.edu/Marketsim

A transcription of the Udacity Course lectures can be found at https://docs.google.com/document/d/1ELqlnuTSdc9-MDHOkV0uvSY4RmI1eslyQlU9DgOY_jc/edit?usp=sharing

Kairoart 2018



## Overview

In this project you will create a market simulator that accepts trading orders, keeps track of a portfolio's value over time, and then assesses the performance of that portfolio.

## Part 1: Create a market simulation function

Your job is to implement your market simulator as a function, compute_portvals(), that returns a DataFrame with one column. It should adhere to the following API:

    def compute_portvals(orders_file = "./orders/orders.csv", start_val = 1000000, commission = 9.95, impact = 0.005):
        # TODO: Your code here
        return portvals

The start date and end date of the simulation are the first and last dates with orders in the orders_file. The arguments are as follows:

    orders_file is the name of a file from which to read orders
    start_val is the starting value of the portfolio (initial cash available)
    commission is the fixed amount in dollars charged for each transaction (both entry and exit)
    impact is the amount the price moves against the trader compared to the historical data at each transaction
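
As a rough illustration of how the last two arguments could enter the cash calculation, here is a minimal sketch. It assumes that impact is a fraction of the historical price (0.005 meaning 0.5%) and that the commission is deducted on every order; the helper name trade_cash_delta is hypothetical and is not part of the required API.

```python
def trade_cash_delta(order, shares, price, commission=9.95, impact=0.005):
    """Return the change in cash caused by one order (negative means cash leaves the account).

    Assumption: impact is a fraction of the historical price, applied against the trader.
    """
    if order == "BUY":
        # The buyer pays more than the historical price, plus the commission.
        return -(shares * price * (1.0 + impact)) - commission
    if order == "SELL":
        # The seller receives less than the historical price, minus the commission.
        return shares * price * (1.0 - impact) - commission
    raise ValueError(f"unknown order type: {order!r}")


buy_delta = trade_cash_delta("BUY", 100, 10.0)
sell_delta = trade_cash_delta("SELL", 100, 10.0)
print(buy_delta, sell_delta)
```

Under these assumptions a round trip at an unchanged price loses money twice over: once through the two commissions and once through the impact on each side.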

Return the result (portvals) as a single-column pandas.DataFrame (column name does not matter), containing the value of the portfolio for each trading day in the first column from start_date to end_date, inclusive.

The files containing orders are CSV files with the following columns:

    Date (yyyy-mm-dd)
    Symbol (e.g. AAPL, GOOG)
    Order (BUY or SELL)
    Shares (no. of shares to trade)

For example: 

    Date,Symbol,Order,Shares
    2008-12-3,AAPL,BUY,130
    2008-12-8,AAPL,SELL,130
    2008-12-5,IBM,BUY,50
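
Note that the orders in this example are not in date order. A minimal sketch of reading such a file with pandas and deriving the simulation range from it might look like the following; the CSV text is the example above, read from a string rather than a file so that the snippet is self-contained.

```python
import io

import pandas as pd

ORDERS_CSV = """Date,Symbol,Order,Shares
2008-12-3,AAPL,BUY,130
2008-12-8,AAPL,SELL,130
2008-12-5,IBM,BUY,50
"""

# Parse the Date column and use it as the index.
orders = pd.read_csv(io.StringIO(ORDERS_CSV), parse_dates=["Date"], index_col="Date")

# A stable sort keeps same-day orders in their original file order.
orders = orders.sort_index(kind="mergesort")

# The simulation runs from the first to the last date that has an order.
start_date, end_date = orders.index.min(), orders.index.max()
symbols = sorted(orders["Symbol"].unique())

print(orders)
print(start_date.date(), end_date.date(), symbols)
```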


### Goal

Your simulator should calculate the total value of the portfolio for each day using adjusted closing prices (cash plus value of equities) and print the result to the file values.csv. The contents of the values.csv file should look something like this:

    2008, 12, 3, 1000000
    2008, 12, 4, 1000010
    2008, 12, 5, 1000250
    ...
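
One possible way to produce lines in that shape from the portvals DataFrame is sketched below. The helper name write_values is hypothetical, the values are the three sample rows above, and rounding to whole dollars is an assumption based on that sample.

```python
import io

import pandas as pd


def write_values(portvals, out):
    """Write one 'year, month, day, value' line per trading day to a file-like object."""
    for date, value in portvals.iloc[:, 0].items():
        out.write(f"{date.year}, {date.month}, {date.day}, {value:.0f}\n")


portvals = pd.DataFrame(
    {"portval": [1000000.0, 1000010.0, 1000250.0]},
    index=pd.to_datetime(["2008-12-03", "2008-12-04", "2008-12-05"]),
)

# An in-memory buffer stands in for open("values.csv", "w").
buffer = io.StringIO()
write_values(portvals, buffer)
print(buffer.getvalue(), end="")
```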


### How it should work

Your code should keep account of how many shares of each stock are in the portfolio on each day and how much cash is available on each day. Note that negative shares and negative cash are possible. Negative shares mean that the portfolio is in a short position for that stock. Negative cash means that you've borrowed money from the broker.

When a BUY order occurs, you should add the appropriate number of shares to the count for that stock and subtract the appropriate cost of the shares from the cash account. The cost should be determined using the adjusted close price for that stock on that day.

When a SELL order occurs, it works in reverse: you should subtract the number of shares from the count and add the proceeds to the cash account.
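
The bookkeeping described above can be sketched as follows. The prices and orders are invented for the illustration, commission and impact are ignored, and the variable names are hypothetical; this is one way to structure the accounting, not the course's reference solution.

```python
import pandas as pd

# Hypothetical adjusted close prices, invented for this illustration.
prices = pd.DataFrame(
    {"AAPL": [10.0, 11.0, 12.0], "IBM": [50.0, 49.0, 51.0]},
    index=pd.to_datetime(["2008-12-03", "2008-12-04", "2008-12-05"]),
)
orders = [  # (date, symbol, order, shares)
    ("2008-12-03", "AAPL", "BUY", 100),
    ("2008-12-04", "IBM", "SELL", 10),  # no IBM is held, so this opens a short position
]

cash = 1000.0
holdings = {symbol: 0 for symbol in prices.columns}
daily_values = []

for day in prices.index:
    for date, symbol, order, shares in orders:
        if pd.Timestamp(date) != day:
            continue
        # BUY adds shares and spends cash; SELL does the reverse.
        signed_shares = shares if order == "BUY" else -shares
        holdings[symbol] += signed_shares
        cash -= signed_shares * prices.loc[day, symbol]
    # Portfolio value = cash plus the market value of every position (shorts count negatively).
    equity = sum(holdings[s] * prices.loc[day, s] for s in prices.columns)
    daily_values.append(float(cash + equity))

portvals = pd.DataFrame({"portval": daily_values}, index=prices.index)
print(portvals)
```

Because cash and holdings both carry over from one day to the next, the portfolio value changes on days without orders whenever the prices of the held stocks change.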

### Report

    * Plot the price history over the trading period.
    * Sharpe ratio of the total portfolio (always assume 252 trading days in a year and a risk-free rate of 0)
    * Cumulative return of the total portfolio
    * Standard deviation of daily returns of the total portfolio
    * Average daily return of the total portfolio
    * Ending value of the portfolio
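
The numeric items in this list can be computed from the single-column portvals DataFrame. The sketch below is one way to do it; the function name portfolio_stats and the sample values are hypothetical, and it uses the pandas default sample standard deviation (ddof=1), which is an assumption, since the list above does not say which convention to use.

```python
import numpy as np
import pandas as pd


def portfolio_stats(portvals, trading_days=252, risk_free_rate=0.0):
    """Summarise a single-column DataFrame of daily portfolio values."""
    values = portvals.iloc[:, 0]
    # Day-over-day relative change; the first day has no previous value and is dropped.
    daily_returns = values.pct_change().dropna()
    avg_daily_return = daily_returns.mean()
    std_daily_return = daily_returns.std()  # pandas default: sample std (ddof=1)
    # Annualised Sharpe ratio from daily returns.
    sharpe_ratio = np.sqrt(trading_days) * (avg_daily_return - risk_free_rate) / std_daily_return
    return {
        "sharpe_ratio": sharpe_ratio,
        "cumulative_return": values.iloc[-1] / values.iloc[0] - 1.0,
        "std_daily_return": std_daily_return,
        "avg_daily_return": avg_daily_return,
        "ending_value": values.iloc[-1],
    }


# Hypothetical portfolio values, invented for this illustration.
portvals = pd.DataFrame({"portval": [100.0, 110.0, 99.0, 108.9]})
stats = portfolio_stats(portvals)
for name, value in stats.items():
    print(f"{name}: {value:.6f}")
```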

import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import operator
import sys
import csv
import math

# To fetch data
from pandas_datareader import data as pdr   
import fix_yahoo_finance as yf  
yf.pdr_override()   


### Read orders from file

def readOrdersFileIntoDF(filename):

    ordersDataFrame = pd.read_csv(filename,  parse_dates=['Date'], index_col="Date")
    
    
    # Getting the Symbols from the Orders. This list will be required for fetching the prices
    symbolList = list(set(ordersDataFrame['Symbol']))
    
    # Returning it.
    return ordersDataFrame, symbolList

### Get data from Yahoo for the given dates 

def fetchData(dt_start, dt_end, ls_symbols):

    # Add a day to dt_end, because the end date passed to Yahoo is treated as exclusive
    dt_end = pd.to_datetime(dt_end) + pd.DateOffset(days=1)
    
    # Get data of trading days between the start and the end.
    df = pdr.get_data_yahoo(
            # tickers list (single tickers accepts a string as well)
            tickers = ls_symbols,

            # start date (YYYY-MM-DD / datetime.datetime object)
            # (optional, default is 1950-01-01)
            start = dt_start,

            # end date (YYYY-MM-DD / datetime.datetime object)
            # (optional, default is today)
            end = dt_end,

            # return a multi-index dataframe
            # (optional, default is Panel, which is deprecated)
            as_panel = False,

            # group by ticker (to access via data['SPY'])
            # (optional, default is 'column')
            group_by = 'ticker',

            # adjust all OHLC automatically
            # (optional, default is False)
            auto_adjust = False
    )
        
    # Selecting the Adj Close column of every ticker (the result is a DataFrame).
    adj_close_price = df.loc[:, (slice(None), ('Adj Close'))]

    # Returning the Adj Close prices for all the days
    return adj_close_price
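
The selection at the end of fetchData relies on the download having two-level columns of the form (ticker, field). The following sketch reproduces that shape with invented numbers, so it runs without network access, and shows an alternative selection with DataFrame.xs that drops the field level so that prices can be looked up by ticker name alone.

```python
import pandas as pd

# Hypothetical data shaped like a group_by='ticker' download: (ticker, field) columns.
dates = pd.to_datetime(["2011-01-10", "2011-01-11"])
columns = pd.MultiIndex.from_product([["AAPL", "IBM"], ["Close", "Adj Close"]])
df = pd.DataFrame(
    [[45.0, 43.0, 120.0, 117.0],
     [46.0, 44.0, 121.0, 118.0]],
    index=dates,
    columns=columns,
)

# Keep only the 'Adj Close' field; xs removes that level, leaving one column per ticker.
adj_close = df.xs("Adj Close", axis=1, level=1)
print(adj_close)

# Prices can now be read by date and ticker label instead of by column position.
price = adj_close.loc[pd.Timestamp("2011-01-11"), "IBM"]
print(price)
```

Looking prices up by label avoids depending on the order in which the tickers happen to appear in the downloaded columns.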

### Market simulator

def marketsim(initialCash, ordersdf, symbols):

    # Boundary dates: the earliest and latest order dates (the orders need not be sorted)
    dt_start = ordersdf.index.min()
    dt_end = ordersdf.index.max()

    # Adjusted closing prices fetched from Yahoo within the range and for the given symbols
    closingPrices = fetchData(dt_start, dt_end, symbols)
    num_tradingDays = len(closingPrices)
    #print(closingPrices)
    
    #Cash for the days
    temp = np.zeros((num_tradingDays, 1))
    cash = pd.DataFrame(temp, columns = ['cashinhand'])
    
    #Value for the days
    temp = np.zeros((num_tradingDays, 1))
    valueFrame = pd.DataFrame(temp, columns = ['valueOfPortfolio'])

    for tradingDayIndex in range(num_tradingDays):
        # Carry the previous day's cash forward; the first day starts from the initial cash.
        if tradingDayIndex != 0:
            cash.iloc[tradingDayIndex, 0] = cash.iloc[tradingDayIndex - 1, 0]
        else:
            cash.iloc[tradingDayIndex, 0] = initialCash

        # Apply every order dated on this trading day, in file order.
        for orderIndex, tradingOrder in enumerate(ordersdf.index):
            if tradingOrder != closingPrices.index[tradingDayIndex]:
                continue
            orderType = ordersdf.Order.iloc[orderIndex]
            if orderType not in ('BUY', 'SELL'):
                print("error")
                continue
            symbol = ordersdf.Symbol.iloc[orderIndex]
            numShares = ordersdf.Shares.iloc[orderIndex]
            # Look the price up by column label; the order of `symbols` need not match the column order.
            priceForTheDay = closingPrices[(symbol, 'Adj Close')].iloc[tradingDayIndex]
            # A BUY spends cash, a SELL adds to it.
            sign = -1.0 if orderType == 'BUY' else 1.0
            cash.iloc[tradingDayIndex, 0] += sign * priceForTheDay * float(numShares)
            print("Date: ", tradingOrder,
                  "Symbol: ", symbol,
                  "Order: ", orderType,
                  "Shares: ", numShares,
                  "Price: ", priceForTheDay,
                  "Cash: ", cash.iloc[tradingDayIndex, 0])

    # Note: only the cash balance is recorded here; the value of the shares held is not added yet.
    valueFrame['valueOfPortfolio'] = cash['cashinhand'].values
        
    valueFrame.index = closingPrices.index

    return valueFrame

### Define market simulation function

def compute_portvals(orders_file = "./orders/orders-01.csv", start_val = 1000000, commission = 9.95, impact = 0.005):
    # Reading the orders from the file into a DataFrame, plus the list of symbols traded.
    # Note: commission and impact are accepted but not applied yet.
    ordersDataFrame, symbols = readOrdersFileIntoDF(orders_file)

    # Getting data from marketsim
    print("List of transactions")
    valueFrame = marketsim(start_val, ordersDataFrame, symbols)
    return valueFrame


compute_portvals()

List of transactions
[*********************100%***********************]  4 of 4 downloaded
Date:  2011-01-10 00:00:00 Symbol:  AAPL Order:  BUY Shares:  1500 Price:  43.356891999999995 Cash:  934964.662
Date:  2011-01-13 00:00:00 Symbol:  AAPL Order:  SELL Shares:  1500 Price:  43.765831 Cash:  1000613.4085
Date:  2011-01-13 00:00:00 Symbol:  IBM Order:  BUY Shares:  4000 Price:  117.880554 Cash:  529091.1925
Date:  2011-01-26 00:00:00 Symbol:  GOOG Order:  BUY Shares:  1000 Price:  306.258087 Cash:  222833.1055
Date:  2011-02-02 00:00:00 Symbol:  XOM Order:  SELL Shares:  4000 Price:  65.633003 Cash:  485365.1175
Date:  2011-02-10 00:00:00 Symbol:  XOM Order:  BUY Shares:  4000 Price:  65.81276700000001 Cash:  222114.04949999996
Date:  2011-03-03 00:00:00 Symbol:  GOOG Order:  SELL Shares:  1000 Price:  302.810516 Cash:  524924.5655
Date:  2011-03-03 00:00:00 Symbol:  GOOG Order:  SELL Shares:  2200 Price:  302.810516 Cash:  1191107.7007
Date:  2011-05-03 00:00:00 Symbol:  IBM Order: 

Date,valueOfPortfolio
2011-01-10,9.349647e+05
2011-01-11,9.349647e+05
2011-01-12,9.349647e+05
2011-01-13,5.290912e+05
2011-01-14,5.290912e+05
2011-01-18,5.290912e+05
2011-01-19,5.290912e+05
2011-01-20,5.290912e+05
2011-01-21,5.290912e+05
2011-01-24,5.290912e+05


## Part 2: Transaction Costs
