# &#x1F4D1; &nbsp; <span style="color:#338DD4"> Reflections. Machine Learning for Trading. Lessons 2</span>

##  &#x1F4CA; &nbsp; Links

Predictive Analytics: https://practicalanalytics.co/predictive-analytics-101/

Machine Learning for Trading: https://www.overleaf.com/articles/machine-learning-for-trading/bbckwqwmdnrw/viewer.pdf

INVESTMENT PORTFOLIO OPTIMISATION WITH PYTHON: http://www.pythonforfinance.net/2017/01/21/investment-portfolio-optimisation-with-python/

Introduction to Portfolio Analysis in R: https://www.datacamp.com/courses/introduction-to-portfolio-analysis-in-r

DX Analytics: http://dx-analytics.com/index.html

PyMacLab: https://github.com/escheffel/pymaclab/

Technical Indicators in Python: https://www.quantinsti.com/blog/build-technical-indicators-in-python/

Weak, Semi-Strong and Strong EMH: http://www.investopedia.com/exam-guide/cfa-level-1/securities-markets/weak-semistrong-strong-emh-efficient-market-hypothesis.asp

The Sharpe Ratio and the Information Ratio: http://www.cfapubs.org/doi/full/10.2469/ipmn.v2011.n1.7?src=recsys

Modern data: http://moderndata.plot.ly/portfolio-optimization-using-r-and-plotly/

Currency Portfolio Optimization Using ScienceOps: http://blog.yhat.com/posts/currency-portfolio-optimization-using-scienceops.html

### 02-01 So you want to be a hedge fund manager?

#### Analytics

Analytics for simplicity:
    
- What happened, where and when: Descriptive Analytics
- Why did it happen: Diagnostic or Prescriptive Analytics
- What is likely to happen: Predictive Analytics
- Guided actions and steps: Machine Learning, AI and Cognitive Learning
- Conversational AI: Chatbots

Predictive analytics leverages four core techniques to turn data into valuable, actionable information:

- Predictive modeling
- Decision Analysis and Optimization
- Transaction Profiling
- Predictive Search (supervised machine learning)

#### Fund Types

An **exchange-traded** fund (ETF) is an investment fund traded on stock exchanges, much like stocks. An ETF holds assets such as stocks, commodities, or bonds, and trades close to its net asset value over the course of the trading day. Most ETFs track an index, such as a stock index or bond index. ETFs may be attractive as investments because of their low costs, tax efficiency, and stock-like features. By 2013, ETFs were the most popular type of exchange-traded product.

A **mutual** fund is a professionally managed investment fund that pools money from many investors to purchase securities. While there is no legal definition of the term "mutual fund", it is most commonly applied to open-end investment companies, which are collective investment vehicles that are regulated and sold to the general public on a daily basis. They are sometimes referred to as "investment companies" or "registered investment companies".

A **hedge** fund is an investment fund that pools capital from accredited individuals or institutional investors and invests in a variety of assets, often with complex portfolio-construction and risk-management techniques. It is administered by a professional investment management firm, and often structured as a limited partnership, limited liability company, or similar vehicle.

In [None]:
# Quiz: What Type Of Fund Is It?
# MEMHE

Historical data, current portfolio, and prediction algorithms are fed into an optimization
program to produce a target portfolio. The majority of machine learning comes into play
when determining the market forecast.

Three Essential Metrics of Mutual Funds:
1.  Identify a Proper Benchmark for the Fund by Utilizing R-Squared
2.  Compare the Fund's Volatility Relative to the Benchmark Using Beta
3.  Determine the Fund's Risk-Adjusted Excess Performance or Alpha

The **expense ratio** is the annual fee that all funds or ETFs charge their shareholders. It expresses the percentage of assets deducted each fiscal year for fund expenses, including 12b-1 fees, management fees, administrative fees, operating costs, and all other asset-based costs incurred by the fund.



**'Two and twenty'** is a type of compensation structure that hedge fund managers typically employ in which part of compensation is performance-based. This phrase refers to how hedge fund managers charge a flat 2% of total asset value as a management fee and an additional 20% of any profits earned.
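
The fee arithmetic can be sketched in a few lines. This is a minimal illustration only: the function name, its parameters, and the example fund figures are hypothetical, and it ignores refinements such as hurdle rates or high-water marks that the text does not cover.

```python
def two_and_twenty_fee(assets_under_management, profit,
                       management_rate=0.02, performance_rate=0.20):
    """Return the annual fee under a 'two and twenty' style structure."""
    # Flat management fee on total asset value.
    management_fee = management_rate * assets_under_management
    # Assumption: the performance fee applies only when profit is positive.
    performance_fee = performance_rate * max(profit, 0.0)
    return management_fee + performance_fee


# Hypothetical fund: 100 million USD under management, 15 million USD profit.
fee = two_and_twenty_fee(100_000_000, 15_000_000)
print(fee)
```

With these made-up numbers the fee is the sum of a 2 million USD management fee and a 3 million USD performance fee.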


### 02-02 Market Mechanics

An **order book** is the list of orders (manual or electronic) that a trading venue (in particular stock exchanges) uses to record the interest of buyers and sellers in a particular financial instrument. A matching engine uses the book to determine which orders can be fulfilled i.e. what trades can be made.

The live portfolio is altered by giving orders to a broker in the stock market, so it serves to
know what exactly is in an order. The broker needs to know whether to buy or sell and of
which stock(s) by their market symbols. The order must also contain the number of shares
and the type of order. Stock exchanges only consider limit orders and market orders, but
note that orders can be increasingly complex based on instructions given to a broker. A
market order is an order at the current market price, and ordering at a limit price tells the
broker to only buy or sell at a certain price; for example, some may not want to buy beyond
some value or sell below some price. Of course, a limit order must also include the desired
price.

After the broker sends the order to the stock exchange, it is made public in the style
of an "order book". Others can see the stocks and collective bids that have been made on
them, but not who has placed the orders. The order book contains a list for each stock of
the orders within it including whether the order asks for others to buy or bids on the stock.
Both types include a price at which orders are allowed to be bought/sold at and the size of
the order. Orders of the same type and price are lumped together. Market orders always
sell at the highest bid and buy at the lowest asking price.
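
How a market buy order consumes the ask side of a book can be sketched as follows. This is a simplified illustration, not how any particular exchange's matching engine works; the function name and the sample asks are made up for the example.

```python
def fill_market_buy(asks, shares):
    """Fill a market buy order against a list of (price, size) asks.

    Returns the list of (price, filled_size) executions.
    """
    fills = []
    remaining = shares
    # Walk the ask side from the lowest asking price upward.
    for price, size in sorted(asks):
        if remaining == 0:
            break
        take = min(size, remaining)
        fills.append((price, take))
        remaining -= take
    return fills


# Hypothetical ask side of an order book: (price, number of shares).
asks = [(100.10, 200), (100.05, 100), (100.20, 500)]
fills = fill_market_buy(asks, 250)
print(fills)
```

The order first takes all shares at the lowest ask and then moves to the next price level, which is why large market orders can move the price.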


Dynamics of Exchange

There are many stock exchanges, and each has its own order book. When an order is placed,
say by an individual, the order is sent to the broker and the broker chooses between the
stock exchanges to execute that order. The broker takes information from all the stock
exchanges and makes a transaction based on which one has the stock with the best price.
Fees are associated with making transactions in stock exchanges and with using a broker. A
broker typically has many clients; the broker can observe clients who want to buy and sell
at the same price and circumvent stock exchanges entirely. The law ensures that this trade
can only happen if both the buyer and seller get prices that are at least as good as at an
exchange. Even if this transaction cuts out the stock exchange, it must still be registered
with one, and it's usually with the exchange where the stock is housed.

Orders can be handled internally or moved through what's called a "dark pool". A dark
pool is a place where investors can trade without the transparency of an order book outside
of a stock exchange. The results of the trade, like internal trades, still need to be registered
with public exchanges after they've occurred. A dark pool can act as an intermediary between
all types of investors that may want to escape the transparency of stock exchanges.
Brokers like this because they don't have to pay fees associated with trading at a stock
exchange. They also argue that it's fair because clients are getting prices that are as good
as at the market. However, hedge funds and dark pools can heavily exploit this system as
it stands if they have well-placed, fast computers.

Exploiting the Market

These days the market is entirely digital and computers automate transactions all across
the country. As a result, orders can be processed in fractions of a second, and timing is
everything. A hedge fund may pay stock exchanges enough money to house computers very
close to the exchanges, which gives them a distinct advantage. For example, let's say that
someone places an order for a stock, and it's sent to multiple markets. A hedge fund close
to those markets can see that order reach one of them first and buy up that stock from the
other exchanges through the high speed infrastructure they have in place. Then when that
order reaches other exchanges, the hedge fund has already bought those shares and sells it
back at a higher price. This is one of many strategies in High Frequency Trading (HFT)
that takes place on the order of milliseconds.

This can also happen on an international scale. A hedge fund may have computers collocated
with international markets rapidly observing and comparing order books between
them. If a difference occurs in a stock between the two markets, all that needs to be done
is to sell in the market with a higher price and buy in the market with a lower one. This
happens very quickly, so the price of the stock is not very different at different markets.
HFT strategies usually trade high volumes to turn a large profit in small price differences.
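
The arithmetic behind this kind of cross-market trade is simple enough to sketch. The function below is a hypothetical illustration; in particular, charging the same per-share fee on both the buy and the sell leg is an assumption, not something stated in the text.

```python
def arbitrage_profit(low_price, high_price, shares, fee_per_share=0.0):
    """Profit from buying at low_price on one market and selling at high_price on another."""
    # Assumption: the same per-share fee is paid on both legs of the trade.
    spread_after_fees = high_price - low_price - 2 * fee_per_share
    return spread_after_fees * shares


# Hypothetical prices: a two-cent gap traded on 100,000 shares.
profit = arbitrage_profit(10.00, 10.02, 100_000)
print(profit)
```

A gap of a couple of cents only becomes worthwhile at high volume, which matches the observation that HFT strategies rely on large trade sizes.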

Those that operate on HFT strategies can not only manipulate a transparent market,
but also a dark one. First, let's explain why someone would want to use a dark pool. For
example, an investor who wants to sell a high volume of shares in a transparent market
would want to do so in small chunks so as not to upset the price all at once and get less for
the shares. However, others will see this and lower their bids knowing that a high volume
is to be sold and the investor still gets less for their shares. In a dark pool, others can't see
those that want to buy or sell, so the investor may get a better price. There are many ways
to exploit a dark pool, but it always stems from information leakage. Knowing the order
book of a dark pool means a world of advantage. Dark pool operators or constituents may
secretly participate in their own pool or leak information about it to others for a price. The
private nature of a dark pool allows those who operate it to make their own rules about
who can participate and how trading works. Since information is at the discretion of the
operator, it's fairly easy for those with direct access to exploit a dark pool. Those that
don't have direct access can "game" the pool by probing it with many small volume orders.
This yields some idea of the size and prices of bids, which gamers can exploit by selling
when they find the bids are highest and buying when asking is lowest.

Other Orders and Shorting

Although exchanges only take market and limit orders, other orders can be made through
a broker. Often a broker implements them for clients without their knowledge to benefit
both themselves and the client. The simplest order beyond a limit order is a stop-loss
order. The broker holds the stock until the price drops below a certain price, then sells at
market price. Similarly, a stop-gain order waits until the price climbs to a point at which
the client wants to sell. Slightly more complex is the trailing stop. Here the broker sets a
stop-loss criterion trailing a stock that's increasing in price. As the price increases, so does
the stop-loss criterion; when that stock starts to decrease, the stop-loss criterion is met and
the shares are sold.
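
A trailing stop can be sketched as a loop that tracks the highest price seen so far. This is a minimal illustration under the assumption that the stop trails the running high by a fixed fraction; the function name and the price series are made up for the example.

```python
def trailing_stop_exit(prices, trail_fraction):
    """Return the index at which a trailing stop would trigger, or None.

    The stop level trails the highest price seen so far by trail_fraction.
    """
    highest = prices[0]
    for i, price in enumerate(prices):
        # The stop level only ever moves up with the running high.
        highest = max(highest, price)
        stop_level = highest * (1.0 - trail_fraction)
        if price <= stop_level:
            return i
    return None


# Hypothetical price path: rises to 110, then falls.
prices = [100, 104, 110, 108, 98, 102]
exit_index = trailing_stop_exit(prices, 0.10)
print(exit_index)
```

Here the stop rises with the price up to the peak and triggers on the first price that falls at least 10% below that peak.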

What if someone wanted to bet that a stock will decrease and still profit? Instead of
buying low and selling high, the shares can be borrowed and sold first, so their value
is collected at a high point while the shares themselves are still owed. Then when the price of the
stock decreases, the shares can be bought back at a lower price and returned to the lender,
and the difference is kept as profit. This is called shorting. As
long as the price of the stock goes down, this is a good strategy; however, if the price goes
up, then the difference results in a net loss.
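
The profit and loss of a short position follows directly from this description. The sketch below uses a hypothetical function name and ignores borrowing fees and margin requirements, which the text does not discuss.

```python
def short_sale_profit(sell_price, buy_back_price, shares):
    """Profit from selling borrowed shares and buying them back later."""
    # Positive when the price fell, negative when it rose.
    return (sell_price - buy_back_price) * shares


print(short_sale_profit(100, 90, 50))   # price fell after the short sale
print(short_sale_profit(100, 110, 50))  # price rose after the short sale
```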

### 02-03 What is a company worth?

**Net worth** is the value of all the non-financial and financial assets owned by an institutional unit or sector less the value of all its outstanding liabilities. Thus, net worth can refer to companies, individuals, or governments. It can also refer to economic sectors such as the sector of financial corporations or to entire countries.

**Fair Market Value** is the most probable price which a company or an asset would bring in a competitive and open market (in a fair sale). It is an estimate of the market value that an interested, knowledgeable but not desperate buyer is willing to pay an interested, knowledgeable but not desperate seller, assuming the company or asset is exposed on the open market for a reasonable period of time and the sale is paid using cash or cash equivalent.

**Liquidation value** is the price of a company's tangible assets if it goes out of business and needs to be liquidated within a limited period of time. Liquidation value is typically lower than fair market value because the assets are given insufficient exposure to investors in the open market. Intangible assets, including intellectual property, reputation and goodwill, are not included in liquidation value. If the company were to be sold rather than liquidated, then the price is called **going-concern value**, which includes the liquidation value and the present value of its intangible assets.

**Investment value** is the value of a company or an asset to a particular investor, using the investor's specific judgments and assumptions. It can be much higher than the fair market value to an investor who has the ability to put the asset to its most productive use, and it can also be lower than the fair market value to an investor who has limited ability to make the best use of the property.

In finance, **intrinsic value** refers to the value of a company, stock, currency or product determined through fundamental analysis without reference to its market value. It includes variables such as brand name, patents, copyrights, business model and personal contacts which are difficult to properly value in the open market. 

**Book value** literally means the value of the business according to its "books" or financial statements. In this case, book value is calculated from the balance sheet, and it is the difference between a company's total assets and total liabilities. Note that this is also the term for shareholders' equity. For example, if Company XYZ has total assets of 100 million USD and total liabilities of 80 million USD, the book value of the company is 20 million USD. In a very broad sense, this means that if the company sold off its assets and paid down its liabilities, the equity value or net worth of the business, would be 20 million USD.

**Market value** is the value of a company according to the stock market. Market value is calculated by multiplying a company's shares outstanding by its current market price. If Company XYZ has 1 million shares outstanding and each share trades for 50 USD, then the company's market value is 50 million USD. Market value is most often the number analysts, newspapers and investors refer to when they mention the value of the business.

**Market capitalization (market cap)** is the market value at a point in time of the shares outstanding of a publicly traded company, being equal to the share price at that point of time times the number of shares outstanding.
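
The two worked examples above (book value and market value of Company XYZ) reduce to one line of arithmetic each. The helper names below are illustrative only.

```python
def book_value(total_assets, total_liabilities):
    """Book value: total assets minus total liabilities."""
    return total_assets - total_liabilities


def market_cap(shares_outstanding, share_price):
    """Market capitalization: shares outstanding times the current share price."""
    return shares_outstanding * share_price


# Company XYZ figures from the text, in USD.
xyz_book = book_value(100_000_000, 80_000_000)
xyz_cap = market_cap(1_000_000, 50)
print(xyz_book, xyz_cap)
```

For Company XYZ this gives a book value of 20 million USD and a market value of 50 million USD, matching the figures in the definitions.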

News about companies can drastically change some of these measures. Investors reflect
their opinions on the worth of a company through stocks: if they feel the company is worth
less, they will sell and vice versa. Let's say bad news about a company comes up; investors
will see that as increased risk in investing in the company. The company will have to increase
their IR to appease investors and the intrinsic value of the company will reduce. This
would also reduce the stock price of the company, which decreases the market capitalization
of the company. News can affect singular companies, sectors of business, and the market as
a whole depending on the scope of the news.

Market strategies are based on deviations in the estimated values of a company. For
example, if the intrinsic value of a company drops and the stock price is relatively high given
its history, then it would probably be a good idea to short that stock because the price will
almost certainly go down. The book value of a company provides somewhat of a minimum
for the market cap; that is because if the market cap goes below the book value, then a
predatory buyer typically buys the whole company, breaks it apart, and sells its parts for
the book value to turn a profit.

### 02-04 The Capital Assets Pricing Model (CAPM)

The CAPM is a model used to determine a theoretically appropriate required rate of return of an asset, in order to make decisions about adding assets to a well-diversified portfolio.

Assumptions of CAPM

All investors:

- Aim to maximize economic utilities (Asset quantities are given and fixed).
- Are rational and risk-averse.
- Are broadly diversified across a range of investments.
- Are price takers, i.e., they cannot influence prices.
- Can lend and borrow unlimited amounts under the risk-free rate of interest.
- Trade without transaction or taxation costs.
- Deal with securities that are all highly divisible into small parcels (All assets are perfectly divisible and liquid).
- Have homogeneous expectations.
- Assume all information is available at the same time to all investors.

A **portfolio** is a set of stocks that are weighted by their value, and all the weights add to 1. Some stocks might be shorted, so technically their portfolio value is negative and really the sum of the absolute value of their weights is 1.
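
The weighting convention with short positions can be made concrete with a small sketch. The function name, ticker symbols, and position values are hypothetical; short positions are represented as negative values.

```python
def portfolio_weights(position_values):
    """Convert signed position values into weights whose absolute values sum to 1."""
    gross = sum(abs(value) for value in position_values.values())
    return {symbol: value / gross for symbol, value in position_values.items()}


# Hypothetical positions in USD; the negative value is a short position.
positions = {"AAA": 6000.0, "BBB": 3000.0, "CCC": -1000.0}
weights = portfolio_weights(positions)
print(weights)
```

Dividing by the gross exposure (the sum of absolute values) is what makes the absolute weights add to 1 even when some positions are short.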


The CAPM predicts the return on stocks within a certain market with the simple equation

$$r_i[t] = \beta_i \, r_m[t] + \alpha_i[t]$$

This says that the return of a stock is largely based on the return of the market as a whole,
rm. The degree to which a stock is affected is based on that stock's particular β value.
Fluctuations that deviate from this are represented in the (theoretically) random variable,
α, which (theoretically) has an expected value of zero. The β and α values of a stock are
calculated based on the historical data of daily returns. The daily returns of a stock are
plotted against those of the market, and the slope of the fitted line constitutes the β value.
The y-intercept and random deviations describe the α value.
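
The line fit described above can be sketched with ordinary least squares in plain Python. This is an illustration under simple assumptions: the function name is hypothetical, the return series are made-up numbers, and the intercept is used as an estimate of the average α.

```python
def fit_beta_alpha(stock_returns, market_returns):
    """Fit stock = beta * market + alpha by ordinary least squares."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    # Slope = covariance(market, stock) / variance(market).
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    # The intercept is the average return not explained by the market.
    alpha = mean_s - beta * mean_m
    return beta, alpha


# Made-up daily returns; the stock is built to have beta 1.5 and alpha 0.001.
market = [0.01, -0.02, 0.015, 0.0, 0.005]
stock = [1.5 * m + 0.001 for m in market]
beta, alpha = fit_beta_alpha(stock, market)
print(beta, alpha)
```

On real data the points scatter around the fitted line, and that scatter is the day-to-day random part of α.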

CAPM says that the relationship of stocks to the market is linear with an average fluctuation of zero from this relationship. This suggests that the best tactic is to simply choose a set of stocks that will perform well with a certain market environment and sit on them. Active management is a way of thinking that believes the α value is not entirely random and can be predicted. This mindset promotes carefully choosing and trading stocks on a regular basis depending on predicted α values. This is the dichotomy between active and passive portfolio management. If we assume that α is entirely random, then the only way we can beat the market is by predicting the return on the market.

### 02-05 How hedge funds use the CAPM

The capital asset pricing model was the work of financial economist (and later, Nobel laureate in economics) William Sharpe, set out in his 1970 book "Portfolio Theory and Capital Markets." His model starts with the idea that individual investment contains two types of risk:

Systematic Risk: These are market risks that cannot be diversified away. Interest rates, recessions and wars are examples of systematic risks.

Unsystematic Risk: Also known as "specific risk," this risk is specific to individual stocks and can be diversified away as the investor increases the number of stocks in his or her portfolio. In more technical terms, it represents the component of a stock's return that is not correlated with general market moves.

Modern portfolio theory shows that specific risk can be removed through diversification. The trouble is that diversification still doesn't solve the problem of systematic risk; even a portfolio of all the shares in the stock market can't eliminate that risk. Therefore, when calculating a deserved return, systematic risk is what plagues investors most. CAPM, therefore, evolved as a way to measure this systematic risk.
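
The effect of diversification on the two kinds of risk can be illustrated with a toy calculation. The sketch below rests on strong simplifying assumptions that are not in the text: equal weights, the same beta for every stock, and specific risks that are uncorrelated and of equal variance. The function name and the numbers are hypothetical.

```python
def portfolio_variance(n_stocks, beta, market_variance, specific_variance):
    """Variance of an equally weighted portfolio under the assumptions above."""
    # Systematic part: unaffected by the number of stocks.
    systematic = beta ** 2 * market_variance
    # Specific part: averages out as more uncorrelated stocks are added.
    specific = specific_variance / n_stocks
    return systematic + specific


for n in (1, 10, 100, 1000):
    print(n, portfolio_variance(n, beta=1.0, market_variance=0.04,
                                specific_variance=0.09))
```

As the number of stocks grows, the specific term shrinks toward zero while the systematic term stays put, which is the floor that diversification cannot remove.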


### 02-06 Technical Analysis

The **commodity channel index (CCI)** is an oscillator which was originally introduced by Donald Lambert in 1980. CCI can be used to identify cyclical turns across asset classes, be it commodities, indices, stocks, or ETFs. CCI is also used by traders to identify overbought/oversold levels for securities.

Estimation

The CCI looks at the relationship between price and a moving average. Steps involved in the estimation of CCI include:

1. Computing the typical price for the security. Typical price is obtained by averaging the high, low and close prices for the day.
2. Finding the moving average for the chosen number of days based on the typical price.
3. Computing the standard deviation for the same period as that used for the MA.

The formula for CCI is given by:

CCI = (Typical price - MA of Typical price) / (0.015 * Standard deviation of Typical price)

The index is scaled by an inverse factor of 0.015 to provide for more readable numbers.

In [None]:
# Commodity Channel Index Python Code
# https://www.quantinsti.com/blog/build-technical-indicators-in-python/


# Load the necessary packages and modules
import pandas as pd
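# pandas.io.data was removed from pandas; DataReader now lives in the separate
# pandas-datareader package (the 'yahoo' source may no longer be available)
from pandas_datareader import data as web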
import matplotlib.pyplot as plt
 
# Commodity Channel Index 
def CCI(data, ndays): 
    TP = (data['High'] + data['Low'] + data['Close']) / 3 
    # pd.rolling_mean / pd.rolling_std were removed; use the .rolling() accessor
    CCI = pd.Series((TP - TP.rolling(ndays).mean()) / (0.015 * TP.rolling(ndays).std()),
    name = 'CCI') 
    data = data.join(CCI) 
    return data
 
# Retrieve the Nifty data from Yahoo finance:
data = web.DataReader('^NSEI',data_source='yahoo',start='1/1/2014', end='1/1/2016')
data = pd.DataFrame(data)
 
# Compute the Commodity Channel Index(CCI) for NIFTY based on the 20-day Moving average
n = 20
NIFTY_CCI = CCI(data, n)
CCI = NIFTY_CCI['CCI']
 
# Plotting the Price Series chart and the Commodity Channel index below
fig = plt.figure(figsize=(7,5))
ax = fig.add_subplot(2, 1, 1)
ax.set_xticklabels([])
plt.plot(data['Close'],lw=1)
plt.title('NSE Price Chart')
plt.ylabel('Close Price')
plt.grid(True)
bx = fig.add_subplot(2, 1, 2)
plt.plot(CCI,'k',lw=0.75,linestyle='-',label='CCI')
plt.legend(loc=2,prop={'size':9.5})
plt.ylabel('CCI values')
plt.grid(True)
plt.setp(plt.gca().get_xticklabels(), rotation=30)

**Ease of Movement (EMV)** is a volume-based oscillator which was developed by Richard Arms. EMV indicates the ease with which prices rise or fall, taking into account the volume of the security. For example, a price rise on low volume means prices advanced with relative ease, and there was little selling pressure. Positive EMV values imply that the market is moving higher with ease, while negative values indicate an easy decline.

Estimation

To calculate the EMV we first calculate the Distance moved. It is given by:


Distance moved = ((Current High + Current Low)/2 - (Prior High + Prior Low)/2)

We then compute the Box ratio which uses the volume and the high-low range:

Box ratio = (Volume / 100,000,000) / (Current High - Current Low)
 
EMV = Distance moved / Box ratio

To compute the n-period EMV we take the n-period simple moving average of the 1-period EMV.

In [None]:
# Ease Of Movement (EVM) Code
 
# Load the necessary packages and modules
import pandas as pd
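# pandas.io.data was removed from pandas; DataReader now lives in the separate
# pandas-datareader package (the 'yahoo' source may no longer be available)
from pandas_datareader import data as web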
import matplotlib.pyplot as plt
 
# Ease of Movement 
def EVM(data, ndays): 
    dm = ((data['High'] + data['Low'])/2) - ((data['High'].shift(1) + data['Low'].shift(1))/2)
    br = (data['Volume'] / 100000000) / ((data['High'] - data['Low']))
    EVM = dm / br 
    # pd.rolling_mean was removed; use the .rolling() accessor
    EVM_MA = pd.Series(EVM.rolling(ndays).mean(), name = 'EVM') 
    data = data.join(EVM_MA) 
    return data 
 
# Retrieve the AAPL data from Yahoo finance:
data = web.DataReader('AAPL',data_source='yahoo',start='1/1/2015', end='1/1/2016')
data = pd.DataFrame(data)
 
# Compute the 14-day Ease of Movement for AAPL
n = 14
AAPL_EVM = EVM(data, n)
EVM = AAPL_EVM['EVM']
 
# Plotting the Price Series chart and the Ease Of Movement below
fig = plt.figure(figsize=(7,5))
ax = fig.add_subplot(2, 1, 1)
ax.set_xticklabels([])
plt.plot(data['Close'],lw=1)
plt.title('AAPL Price Chart')
plt.ylabel('Close Price')
plt.grid(True)
bx = fig.add_subplot(2, 1, 2)
plt.plot(EVM,'k',lw=0.75,linestyle='-',label='EVM(14)')
plt.legend(loc=2,prop={'size':9})
plt.ylabel('EVM values')
plt.grid(True)
plt.setp(plt.gca().get_xticklabels(), rotation=30)

The **moving average** is one of the most widely used technical indicators. It is used along with other technical indicators or it can form the building block for the computation of other technical indicators.

A "moving average" is the average of the asset prices over "x" number of days/weeks. The term "moving" is used because the group of data moves forward with each new trading day. For each new day, we include the price of that day and exclude the price of the first day in the data sequence.

The most commonly used moving averages are the 5-day, 10-day, 20-day, 50-day, and the 200-day moving averages.

In [None]:
# Load the necessary packages and modules
import pandas as pd
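# pandas.io.data was removed from pandas; DataReader now lives in the separate
# pandas-datareader package (the 'yahoo' source may no longer be available)
from pandas_datareader import data as web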
import matplotlib.pyplot as plt
 
# Simple Moving Average 
def SMA(data, ndays): 
    # pd.rolling_mean was removed; use the .rolling() accessor
    SMA = pd.Series(data['Close'].rolling(ndays).mean(), name = 'SMA') 
    data = data.join(SMA) 
    return data
 
# Exponentially-weighted Moving Average 
def EWMA(data, ndays): 
    # pd.ewma was removed; use the .ewm() accessor
    EMA = pd.Series(data['Close'].ewm(span = ndays, min_periods = ndays - 1).mean(), 
    name = 'EWMA_' + str(ndays)) 
    data = data.join(EMA) 
    return data
 
# Retrieve the Nifty data from Yahoo finance:
data = web.DataReader('^NSEI',data_source='yahoo',start='1/1/2013', end='1/1/2016')
data = pd.DataFrame(data) 
close = data['Close']
 
# Compute the 50-day SMA for NIFTY
n = 50
SMA_NIFTY = SMA(data,n)
SMA_NIFTY = SMA_NIFTY.dropna()
SMA = SMA_NIFTY['SMA']
 
# Compute the 200-day EWMA for NIFTY
ew = 200
EWMA_NIFTY = EWMA(data,ew)
EWMA_NIFTY = EWMA_NIFTY.dropna()
EWMA = EWMA_NIFTY['EWMA_200']
 
# Plotting the NIFTY Price Series chart and Moving Averages below
plt.figure(figsize=(9,5))
plt.plot(data['Close'],lw=1, label='NSE Prices')
plt.plot(SMA,'g',lw=1, label='50-day SMA (green)')
plt.plot(EWMA,'r', lw=1, label='200-day EWMA (red)')
plt.legend(loc=2,prop={'size':11})
plt.grid(True)
plt.setp(plt.gca().get_xticklabels(), rotation=30)

The **Rate of Change (ROC)** is a technical indicator that measures the percentage change between the most recent price and the price "n" days ago. The indicator fluctuates around the zero line.

If the ROC is rising, it gives a bullish signal, while a falling ROC gives a bearish signal. One can compute ROC based on different periods in order to gauge the short-term momentum or the long-term momentum.

In [None]:
# Rate of Change code
 
# Load the necessary packages and modules
import pandas as pd
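# pandas.io.data was removed from pandas; DataReader now lives in the separate
# pandas-datareader package (the 'yahoo' source may no longer be available)
from pandas_datareader import data as web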
import matplotlib.pyplot as plt
 
# Rate of Change (ROC)
def ROC(data,n):
    N = data['Close'].diff(n)
    D = data['Close'].shift(n)
    ROC = pd.Series(N/D,name='Rate of Change')
    data = data.join(ROC)
    return data 
 
# Retrieve the NIFTY data from Yahoo finance:
data = web.DataReader('^NSEI',data_source='yahoo',start='6/1/2015',end='1/1/2016')
data = pd.DataFrame(data)
 
# Compute the 5-period Rate of Change for NIFTY
n = 5
NIFTY_ROC = ROC(data,n)
ROC = NIFTY_ROC['Rate of Change']
 
# Plotting the Price Series chart and the Rate of Change below
fig = plt.figure(figsize=(7,5))
ax = fig.add_subplot(2, 1, 1)
ax.set_xticklabels([])
plt.plot(data['Close'],lw=1)
plt.title('NSE Price Chart')
plt.ylabel('Close Price')
plt.grid(True)
bx = fig.add_subplot(2, 1, 2)
plt.plot(ROC,'k',lw=0.75,linestyle='-',label='ROC')
plt.legend(loc=2,prop={'size':9})
plt.ylabel('ROC values')
plt.grid(True)
plt.setp(plt.gca().get_xticklabels(), rotation=30)

The concept of **Bollinger bands** was developed by John Bollinger. These bands consist of an upper Bollinger band and a lower Bollinger band, and are placed two standard deviations above and below a moving average.

Bollinger bands expand and contract based on the volatility. During a period of rising volatility, the bands widen, and they contract as the volatility decreases. Prices are considered to be relatively high when they move above the upper band and relatively low when they go below the lower band.

In [None]:
################ Bollinger Bands #############################
 
# Load the necessary packages and modules
import pandas as pd
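# pandas.io.data was removed from pandas; DataReader now lives in the separate
# pandas-datareader package (the 'yahoo' source may no longer be available)
from pandas_datareader import data as web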
 
# Compute the Bollinger Bands 
def BBANDS(data, ndays):
 
    # pd.rolling_mean / pd.rolling_std were removed; use the .rolling() accessor
    MA = pd.Series(data['Close'].rolling(ndays).mean()) 
    SD = pd.Series(data['Close'].rolling(ndays).std())
 
    b1 = MA + (2 * SD)
    B1 = pd.Series(b1, name = 'Upper BollingerBand') 
    data = data.join(B1) 
  
    b2 = MA - (2 * SD)
    B2 = pd.Series(b2, name = 'Lower BollingerBand') 
    data = data.join(B2) 
 
    return data
 
# Retrieve the Nifty data from Yahoo finance:
data = web.DataReader('^NSEI',data_source='yahoo',start='1/1/2010', end='1/1/2016')
data = pd.DataFrame(data)
 
# Compute the Bollinger Bands for NIFTY using the 50-day Moving average
n = 50
NIFTY_BBANDS = BBANDS(data, n)
print(NIFTY_BBANDS)

### 02-07 Dealing with Data

Adjusted Price

Analysis of historical data is crucial for determining patterns and making economic decisions,
but some things drastically change the price of stocks without having any effect on
the real value of the stock. Dividends and stock splits are two things that do just that.
The adjusted price accounts for these events and corrects the computational problems that
would occur if only the price were taken into account.

Stock splits occur when the price of a stock is too high and the company decides to cut
the price but increase the number of shares so that the overall market cap is the same. This is a
problem when dealing with data because it's seen as a large drop in price. The adjusted
price is calculated going backwards in time: the given and adjusted prices are the
same on the chosen present day, and earlier prices are adjusted going backward from there. If the stock
is ever split, say 3-for-1, then all prices before the split are divided by 3 so that there
is no discontinuity.
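
The backward adjustment for a split can be sketched on a short list of prices. The function name and the price series are hypothetical, and the sketch handles a single split only.

```python
def adjust_for_split(prices, split_index, ratio):
    """Divide all prices before split_index by ratio; later prices are unchanged."""
    return [price / ratio if i < split_index else price
            for i, price in enumerate(prices)]


# Made-up raw closes; a 3-for-1 split takes effect at index 2.
raw = [300.0, 303.0, 101.0, 102.0]
adjusted = adjust_for_split(raw, split_index=2, ratio=3)
print(adjusted)
```

After the adjustment the series moves smoothly across the split date, so a learner no longer sees the split as a two-thirds price crash.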

At the time a company announces the date for payment of dividends, the price of the
stock will increase by the amount of the dividend until it's paid, at which point the price
rapidly decreases by that amount. This is adjusted looking back in time: on the day
a dividend is paid, the preceding prices are decreased by the proportion of the dividend
payment.
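
The dividend adjustment can be sketched in the same backward-looking way. The choice of scaling factor here, one minus the dividend divided by the last price before payment, is an assumption about what "the proportion of the dividend payment" means; the function name and the numbers are made up.

```python
def adjust_for_dividend(prices, pay_index, dividend):
    """Scale all prices before pay_index down by the dividend's proportion.

    Assumes pay_index >= 1 so that a price before the payment exists.
    """
    # Assumed factor: 1 - dividend / last price before the payment.
    factor = 1.0 - dividend / prices[pay_index - 1]
    return [price * factor if i < pay_index else price
            for i, price in enumerate(prices)]


# Made-up raw closes; a 2.04 dividend is paid at index 2.
raw_closes = [100.0, 102.0, 100.0, 101.0]
adjusted_closes = adjust_for_dividend(raw_closes, pay_index=2, dividend=2.04)
print(adjusted_closes)
```

In this example the factor is 0.98, so every price before the payment date is reduced by 2% and prices from the payment date onward are left alone.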

As a note for machine learning, the data that is chosen for the learner is very important.
If the stocks from today are chosen and analyzed starting from 7 years ago, then those stocks
will of course do well because they've survived. That's using a biased strategy, so what needs
to be done is to take index stocks from 7 years ago and run with those. For adjusted price,
it's also important to note that the adjusted price will be different depending on where the
starting point is chosen, and that should also be taken into account.


###  02-08 Efficient Markets Hypothesis

The Efficient Markets Hypothesis:
        
1. Large number of investors. The most important assumption of the EMH is that there
are a large number of investors operating for profit. They have an incentive to find where the price
of a stock is out of line with its true value. Because there are so many investors, any
time new information comes out, the price is going to change accordingly.
2. New information arrives randomly
3. Prices adjust quickly
4. Prices reflect all available information

The Three Basic Forms of the EMH

The efficient market hypothesis (EMH) assumes that markets are efficient. It comes in three basic forms, which differ in how much information prices are assumed to reflect:

- **Weak-Form EMH**

The weak-form EMH implies that the market is efficient, reflecting all market information. This hypothesis assumes that the rates of return on the market should be independent; past rates of return have no effect on future rates. Given this assumption, rules based on past prices, such as the ones traders use to decide when to buy or sell a stock, are invalid.

- **Semi-Strong EMH**

The semi-strong form EMH implies that the market is efficient, reflecting all publicly available information. This hypothesis assumes that stocks adjust quickly to absorb new information. The semi-strong form EMH also incorporates the weak-form hypothesis. Given the assumption that stock prices reflect all new available information and investors purchase stocks after this information is released, an investor cannot benefit over and above the market by trading on new information.

- **Strong-Form EMH**

The strong-form EMH implies that the market is efficient: it reflects all information both public and private, building and incorporating the weak-form EMH and the semi-strong form EMH. Given the assumption that stock prices reflect all information (public as well as private) no investor would be able to profit above the average investor even if he was given new information.

Weak Form Tests

The tests of the weak form of the EMH can be categorized as:

Statistical Tests for Independence - In our discussion on the weak-form EMH, we stated that the weak-form EMH assumes that the rates of return on the market are independent. Given that assumption, the tests used to examine the weak form of the EMH test for the independence assumption. Examples of these tests are the autocorrelation tests (returns are not significantly correlated over time) and runs tests (stock price changes are independent over time).
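As a rough illustration of the two independence tests named above, here is a sketch using only the standard library. The function names are hypothetical, and the runs test shown is the Wald-Wolfowitz version applied to the signs of the returns, which is one common choice rather than the only one.

```python
import math


def lag1_autocorrelation(returns):
    """Sample lag-1 autocorrelation of a return series."""
    n = len(returns)
    mean = sum(returns) / n
    num = sum((returns[t] - mean) * (returns[t - 1] - mean) for t in range(1, n))
    den = sum((r - mean) ** 2 for r in returns)
    return num / den


def runs_test_z(returns):
    """Z-score of the Wald-Wolfowitz runs test on the signs of returns."""
    signs = [r > 0 for r in returns if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A run is a maximal stretch of consecutive returns with the same sign.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)) / (n ** 2 * (n - 1.0))
    return (runs - expected) / math.sqrt(variance)


# Hypothetical series that alternates up and down: clearly not independent.
alternating = [0.01, -0.01] * 10
ac = lag1_autocorrelation(alternating)
z = runs_test_z(alternating)
print(round(ac, 3), round(z, 3))
```

For independent returns the autocorrelation should be close to zero (roughly within ±2/√n) and the runs-test z-score should be small in absolute value (below about 1.96 at the 5% level). The alternating series fails both checks.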

Trading Tests - Another point we discussed regarding the weak-form EMH is that past returns are not indicative of future results, therefore, the rules that traders follow are invalid. An example of a trading test would be the filter rule, which shows that after transaction costs, an investor cannot earn an abnormal return.
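A filter rule can be sketched as follows: buy when the price rises by a threshold from its most recent low, sell when it falls by the same threshold from its most recent high. The function name, the 5% threshold, the 0.1% proportional cost and the price series are all assumptions chosen for illustration.

```python
def filter_rule_return(prices, threshold=0.05, cost=0.001):
    """Total return of a long-only filter rule, net of proportional costs."""
    in_market = False
    wealth = 1.0
    low = high = prices[0]
    for prev, price in zip(prices, prices[1:]):
        if in_market:
            wealth *= price / prev            # earn the period's return while invested
            high = max(high, price)
            if price <= high * (1 - threshold):   # fell by the threshold from the peak: sell
                wealth *= 1 - cost
                in_market = False
                low = price
        else:
            low = min(low, price)
            if price >= low * (1 + threshold):    # rose by the threshold from the trough: buy
                wealth *= 1 - cost
                in_market = True
                high = price
    return wealth - 1.0


# Hypothetical prices: the rule buys at 106 and sells at 104.
prices = [100.0, 106.0, 110.0, 104.0, 100.0]
rule_return = filter_rule_return(prices)
buy_and_hold = prices[-1] / prices[0] - 1.0
print(rule_return, buy_and_hold)
```

In this toy series the rule loses about 2% after costs while buy-and-hold breaks even. A real trading test would run the rule over long histories and many stocks before drawing any conclusion.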

Semi-strong Form Tests

Given that the semi-strong form implies that the market is reflective of all publicly available information, the tests of the semi-strong form of the EMH are as follows:

Event Tests - The semi-strong form assumes that the market is reflective of all publicly available information. An event test analyzes the security both before and after an event, such as earnings. The idea behind the event test is that an investor will not be able to reap an above average return by trading on an event.
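One simple way to set up an event test is to compute abnormal returns (stock return minus market return, the so-called market-adjusted model) and sum them over a window. The helper names and all numbers below are hypothetical; more careful studies estimate expected returns with a regression model instead.

```python
def abnormal_returns(stock_returns, market_returns):
    """Market-adjusted abnormal return: stock return minus market return."""
    return [s - m for s, m in zip(stock_returns, market_returns)]


def cumulative_abnormal_return(abnormal, start, end):
    """Sum of abnormal returns over days start..end (inclusive)."""
    return sum(abnormal[start:end + 1])


# Hypothetical daily returns; the event (e.g. an earnings release) is on day 2.
stock = [0.001, 0.002, 0.050, 0.001, 0.000]
market = [0.001, 0.002, 0.001, 0.001, 0.000]
event_day = 2

ar = abnormal_returns(stock, market)
car_event = cumulative_abnormal_return(ar, event_day, event_day)
car_post = cumulative_abnormal_return(ar, event_day + 1, len(ar) - 1)
print(car_event, car_post)
```

Here the whole reaction happens on the event day and the post-event cumulative abnormal return is zero, which is the pattern the semi-strong form predicts: an investor who buys after the news is public earns nothing extra.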

Regression/Time Series Tests - Remember that a time series forecasts returns based on historical data. As a result, an investor should not be able to achieve an abnormal return using this method.

Strong-Form Tests

Given that the strong-form implies that the market is reflective of all information, both public and private, the tests for the strong-form center around groups of investors with excess information. These investors are as follows:

Insiders - Insiders to a company, such as senior managers, have access to inside information. SEC regulations forbid insiders from using this information to achieve abnormal returns.

Exchange Specialists - An exchange specialist recalls runs on the orders for a specific equity. It has been found, however, that exchange specialists can achieve above average returns with this specific order information.

Analysts - The equity analyst has been an interesting test. It analyzes whether an analyst's opinion can help an investor achieve above average returns. Analysts do typically cause movements in the equities they focus on.

Institutional money managers - Institutional money managers, working for mutual funds, pensions and other types of institutional accounts, have typically been found not to perform above the overall market benchmark on a consistent basis.

### 02-09 The Fundamental Law of active portfolio management

Richard Grinold was trying to find a way of relating performance, skill, and breadth. For example, you might have lots of skill to pick stocks well, but you might not have the breadth to use that skill. So he developed the following relationship:
    
$\text{performance} = \text{skill} \times \sqrt{\text{breadth}}$

So we need some way of measuring skill and breadth. Performance is summarized by something called the information ratio:

$IR = IC \times \sqrt{BR}$

where IC is the information coefficient, and BR is the breadth, i.e. the number of trading opportunities we have.

The fundamental law states that IR, the ratio of active portfolio return to active portfolio risk, equals the IC times the square root of the BR. IC, the correlation between the manager's forecasted and realized excess returns, measures the active manager's skill. BR is the number of signals behind the manager's forecast, or the number of "bets" in the actively managed portfolio.
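The relationship is easy to explore numerically. The sketch below uses hypothetical helper names and made-up inputs; it only restates IR = IC × √BR and its inverse.

```python
import math


def information_ratio(ic, breadth):
    """Fundamental law: IR = IC * sqrt(BR)."""
    return ic * math.sqrt(breadth)


def breadth_needed(target_ir, ic):
    """Breadth required to reach a target IR at a given skill level."""
    return (target_ir / ic) ** 2


# Hypothetical manager: IC of 0.05 and 100 independent bets per year.
ir = information_ratio(0.05, 100)
# Doubling the IR at the same skill takes four times the breadth.
bets_for_double = breadth_needed(2 * ir, 0.05)
print(ir, bets_for_double)
```

Because breadth enters under a square root, improving skill is far more effective than adding bets: doubling IC doubles IR, while doubling BR raises IR by only about 41%.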

The original law applies only to a single-period investment, and BR is not measurable. The author's generalized law applies the capital asset pricing model to isolate the investment manager's residual return, which equals forecast return plus forecast error. He makes four assumptions regarding residual returns, forecasts, and forecast errors:  There will be no long-term excess returns, there will be independence between forecasts and forecast errors, there will be conversion of forecasts into portfolio positions according to modern portfolio theory, and there will be a normal distribution of forecasts. These assumptions are explicitly or implicitly present in the original formulation of the fundamental law.

The Sharpe ratio is the industry standard for measuring risk-adjusted return. Sharpe originally developed this ratio as a single-period forecasting tool and named it the reward-to-variability ratio. The Sharpe ratio was designed as an ex ante, or forward-looking, ratio for determining what reward an investor could expect for investing in a risky asset versus a risk-free asset. The numerator of the ratio is the expected portfolio return less the risk-free rate, and the denominator is the portfolio's expected volatility or standard deviation of returns (less that of the risk-free asset's standard deviation, which is zero). The resulting ratio isolates the expected excess return that the portfolio could be expected to generate per unit of portfolio return variability. Sharpe's original version assumed that borrowing at the risk-free rate would finance the investment in the risky asset, a zero-investment strategy.

The information ratio (IR) is often referred to as a variation or generalized version of the Sharpe ratio. It evolved as users of the Sharpe ratio began substituting passive benchmarks for the risk-free rate. The information ratio tells an investor how much excess return is generated from the amount of excess risk taken relative to the benchmark. It is frequently used by investors to set portfolio constraints or objectives for their managers, such as tracking risk limits or attaining a minimum information ratio.

The Sharpe ratio measures a portfolio's excess return to its total risk; it answers the question of how much an investor was compensated for investing in a risky asset versus a risk-free asset. All portfolios measured with the Sharpe ratio, then, have the same benchmark: the risk-free asset. The information ratio measures a portfolio's excess return relative to its benchmark tracking error. It answers the question of how much reward a manager generated in relation to the risks he or she took deviating from the benchmark. The information ratio is used for measuring active managers against a passive benchmark.
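The two ratios differ only in what is subtracted from the portfolio return. A minimal ex post sketch, assuming periodic returns, the sample standard deviation, and annualization by the square root of the number of periods per year (252 trading days by default); the function names and sample returns are hypothetical.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio; risk_free is the per-period risk-free rate."""
    excess = [r - risk_free for r in returns]
    return math.sqrt(periods_per_year) * statistics.mean(excess) / statistics.stdev(excess)


def realized_information_ratio(returns, benchmark_returns, periods_per_year=252):
    """Annualized mean active return divided by tracking error."""
    active = [r - b for r, b in zip(returns, benchmark_returns)]
    return math.sqrt(periods_per_year) * statistics.mean(active) / statistics.stdev(active)


# Hypothetical daily returns for a portfolio and its benchmark.
portfolio = [0.010, 0.020, -0.010, 0.030]
benchmark = [0.005, 0.015, -0.005, 0.020]
print(sharpe_ratio(portfolio), realized_information_ratio(portfolio, benchmark))
```

A portfolio that simply tracks its benchmark has near-zero active returns, so its information ratio says nothing about how well the benchmark itself did; that is what the Sharpe ratio captures.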

In [5]:
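# Quiz: Simons vs. Buffett
# Both have the same IR; Buffett makes 120 trades a year and Simons' IC is
# 1/1000 of Buffett's, so Simons needs (sqrt(120) / 0.001)**2 trades a year.
import math
(math.sqrt(120)/0.001)**2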

120000000.00000001

### 02-10 Portfolio optimization and the efficient frontier

In [None]:
%%R 
library(PortfolioAnalytics)
library(quantmod)
library(PerformanceAnalytics)

**Modern portfolio theory (MPT)**, or mean-variance analysis, is a mathematical framework for assembling a portfolio of assets such that the expected return is maximized for a given level of risk, defined as variance. Its key insight is that an asset's risk and return should not be assessed by itself, but by how it contributes to a portfolio's overall risk and return.

Portfolio optimization (PO) means minimizing risk for a given target return. Risk is largely defined as the volatility of a stock. A portfolio is composed of some stocks that individually have their own return-risk ratios, but it is possible to weight them such that the return-risk ratio of the portfolio is higher than that of any individual stock.

This is done through combining correlated and anti-correlated stocks to greatly reduce volatility. In the case of a highly correlated group of stocks, their combination results in a similar volatility, but if they're combined with highly anti-correlated stocks, then with accurate weighting, fluctuations cancel out and volatility is minimal while yielding similar returns. A useful algorithm to find the best weighting is mean variance optimization (MVO). *This algorithm is not explained, but we should find it*. MVO and similar algorithms find the minimal risk for a given target return, and if this is plotted over all target returns, we get a curve called the efficient frontier. On a return-risk plot, a line tangent to the efficient frontier with an intercept at the origin (that is, assuming a zero risk-free rate) points to the portfolio with the maximal Sharpe ratio.
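The effect of correlation can be checked with a two-asset example. This is not MVO itself, only a sketch: it uses the standard two-asset variance formula and a simple sweep over weights to find the highest return-to-risk ratio (risk-free rate taken as zero). All names and inputs are hypothetical.

```python
import math


def portfolio_stats(w, mu_a, mu_b, sigma_a, sigma_b, rho):
    """Expected return and volatility with weight w in asset A and 1 - w in asset B."""
    ret = w * mu_a + (1 - w) * mu_b
    var = ((w * sigma_a) ** 2 + ((1 - w) * sigma_b) ** 2
           + 2 * w * (1 - w) * rho * sigma_a * sigma_b)
    return ret, math.sqrt(max(var, 0.0))


# Hypothetical assets: similar expected returns, identical volatility.
MU_A, MU_B = 0.10, 0.08
SIGMA_A, SIGMA_B = 0.20, 0.20

# Same 50/50 weights, same expected return, very different risk.
_, vol_correlated = portfolio_stats(0.5, MU_A, MU_B, SIGMA_A, SIGMA_B, 0.9)
_, vol_anticorrelated = portfolio_stats(0.5, MU_A, MU_B, SIGMA_A, SIGMA_B, -0.9)


def max_sharpe_weight(rho, steps=100):
    """Sweep the weight in asset A and keep the best return-to-risk ratio."""
    best_w, best_sharpe = 0.0, -math.inf
    for i in range(steps + 1):
        w = i / steps
        ret, vol = portfolio_stats(w, MU_A, MU_B, SIGMA_A, SIGMA_B, rho)
        if vol > 0 and ret / vol > best_sharpe:
            best_w, best_sharpe = w, ret / vol
    return best_w, best_sharpe


best_w, best_sharpe = max_sharpe_weight(-0.9)
print(vol_correlated, vol_anticorrelated, best_w, best_sharpe)
```

With a correlation of -0.9 the 50/50 portfolio keeps the 9% expected return while its volatility falls from 20% for either asset alone to under 5%, so its return-to-risk ratio is well above that of either stock. With more than two assets the sweep would be replaced by a proper optimizer.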