Welcome to your DataCamp project audition! This notebook must be filled out and vetted before a contract can be signed and you can start creating your project.

The first step is forking the repository in which this notebook lives. After that, there are two parts to be completed in this notebook:

- **Project information**:  The title of the project, a project description, etc.

- **Project introduction**: The first three text and code cells that will form the introduction of your project.

When complete, please email the link to your forked repo to projects@datacamp.com with the email subject line _DataCamp project audition_. If you have any questions, please reach out to projects@datacamp.com.

# Project information

**Project title**: **Algorithmic Trading with Python** Maximum 41 characters.

**Name:** **Harshit Tyagi**

**Email address associated with your DataCamp account:** harshit.bvcoe@gmail.com

**GitHub username:** harshitcodes.

**Project description**: In this project, we'll learn the fundamentals of quantitative analysis, from data processing to backtesting a strategy. We will use Python to work with historical stock data and develop trading strategies based on a momentum indicator. We will then perform a statistical test on the mean of the returns to conclude whether there is alpha in the signal.

- Build upon the fundamentals of statistics, linear algebra and calculus to predict the returns of the stocks in a portfolio. Backtest the strategy to see how it performs on the available historical data. Learn to interpret the risk metrics used in the quant world, such as alpha, beta, Sharpe ratio, volatility and benchmark returns.
- Here are the prerequisites to get the most out of this project:
    - Fundamentals of Python programming, pandas and Matplotlib
    - Basics of statistics (mean, variance, standard deviation, p-value, etc.) and linear algebra
    - Basics of finance (you can read Python for Finance to get started)
   
- Here is the link to the end-of-day stock prices of Apple (AAPL), extracted from Quandl:
    [https://drive.google.com/open?id=1qwVGElCrE5X7M3L_MhAjtJ0mcs0AOacz](https://drive.google.com/open?id=1qwVGElCrE5X7M3L_MhAjtJ0mcs0AOacz)

# Project introduction

***Note: nothing needs to be filled out in this cell. It is simply setting up the template cells below.***

The final output of a DataCamp project looks like a blog post: pairs of text and code cells that tell a story about data. The text is written from the perspective of the data analyst and *not* from the perspective of an instructor on DataCamp. So, for this blog post intro, all you need to do is pretend like you're writing a blog post -- forget the part about instructors and students.

Below you'll see the structure of a DataCamp project: a series of "tasks" where each task consists of a title, a **single** text cell, and a **single** code cell. There are 8-12 tasks in a project and each task can have up to 10 lines of code. What you need to do:
1. Read through the template structure.
2. As best you can, divide your project as it is currently visualized in your mind into tasks.
3. Fill out the template structure for the first three tasks of your project.

As you are completing each task, you may wish to consult the project notebook format in our [documentation](https://instructor-support.datacamp.com/projects/datacamp-projects-jupyter-notebook). Only the `@context` and `@solution` cells are relevant to this audition.

## 1. Loading the data into a DataFrame

We are going to analyse the end-of-day stock prices of Apple and apply a trading strategy called momentum trading to the data. Writing an algorithm that reliably makes money has always been the central challenge of the quantitative world.

We'll evaluate the returns from the stock prices for different sub-strategies and backtest the results using basic statistical concepts such as the p-value and the Sharpe ratio. Based on the strategy's results and its risk assessment, we'll be in a position to decide whether to trade it in the live market.
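
To make the two statistics concrete, here is a minimal sketch of how they could be computed for a series of daily returns. The function name `mean_return_test`, the made-up return values, the zero risk-free rate and the 252 trading days per year are all assumptions for illustration. The p-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
import statistics


def mean_return_test(returns, periods_per_year=252):
    """Return (t_stat, p_value, annualized_sharpe) for per-period returns.

    Hypothetical helper: tests H0 "the mean return is zero" and assumes
    a risk-free rate of 0 when computing the Sharpe ratio.
    """
    n = len(returns)
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)                  # sample standard deviation
    t_stat = mean / (std / math.sqrt(n))             # one-sample t statistic
    # Two-sided p-value under a normal approximation (large-sample assumption)
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    sharpe = mean / std * math.sqrt(periods_per_year)
    return t_stat, p_value, sharpe


# Made-up daily returns, for illustration only
sample_returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002, 0.005, -0.003] * 30
t_stat, p_value, sharpe = mean_return_test(sample_returns)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, Sharpe = {sharpe:.2f}")
```

A small p-value suggests the mean return is unlikely to be zero by chance alone, while the Sharpe ratio relates that mean return to the volatility taken on to earn it.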

Momentum trading focuses on acceleration in a stock's price or in a company's earnings or revenues. We can then take a long or short position in the stock in the hope that the momentum will continue in the same direction and provide good returns on our investment.
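
As a rough illustration of the idea, one simple price-momentum signal goes long after a positive trailing return and short after a negative one. This is a sketch under stated assumptions, not the strategy built later in this notebook: the function name `momentum_positions`, the lookback window and the prices are all made up.

```python
import numpy as np
import pandas as pd


def momentum_positions(close: pd.Series, lookback: int = 3) -> pd.Series:
    """Hypothetical helper: +1 (long), -1 (short) or 0 (flat) for each day."""
    trailing_return = close.pct_change(lookback)   # return over the lookback window
    signal = np.sign(trailing_return)              # direction of recent momentum
    # Shift by one day so today's position only uses information up to yesterday
    return signal.shift(1).fillna(0.0)


# Made-up closing prices, for illustration only
prices = pd.Series([100.0, 101.0, 103.0, 102.0, 99.0, 97.0, 98.0, 101.0])
positions = momentum_positions(prices, lookback=2)
print(positions.tolist())
```

Shifting the signal by one day avoids look-ahead bias, since a position can only be opened after the trailing return has actually been observed.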


An exciting intro to the analysis. Provide context on the problem you're going to solve, the dataset(s) you're going to use, the relevant industry, etc. You may wish to briefly introduce the techniques you're going to use. Tell a story to get students excited! It should at most have 1200 characters.

![alt text](https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png)

The most common error instructors make in **context cells** is referring to the student or the project. We want project notebooks to appear as a blog post or a data analysis. Bad: *"In this project, you will..."* Good: *"In this notebook, we will..."*

The first task in projects often involves loading data. Please store any data files you use in the `datasets/` folder in this repository.

Images are welcome additions to every Markdown cell, but especially this first one. Make sure the images you use have a [permissive license](https://support.google.com/websearch/answer/29508?hl=en) and display them using [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#images). Store your images in the `img/` folder in this repository.

In [1]:
# Importing the required python packages for reading, 
# manipulating and plotting the data

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()  # apply seaborn's default plot styling

stock_df = pd.read_csv('./datasets/apple_stock_eod_prices.csv')
stock_df.head()


## 2. Exploratory data analysis

Exploratory data analysis (EDA) is a crucial component of data science. It gives us a sense of what the data look like and what kinds of questions they might answer.

We'll figure out which columns are worth analysing and which are not worth our time.
1. Cleansing: checking for problems with the collected data, such as missing values, measurement errors, or wrong column data types.
2. Normality check: if we assume that prices are log-normally distributed (which, in practice, may or may not be true for any given price series), then log returns are normally distributed. This is handy because much of classical statistics presumes normality.
3. Log returns: after analysing the data, we work with log returns, which are approximately normal under the assumption above. This lets us backtest our strategy using standard statistics.
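
The link between log-normal prices and normal log returns can be checked on synthetic data. The sketch below is illustrative only: the prices are simulated with a fixed seed, and the drift and volatility values are arbitrary assumptions rather than estimates from the Apple data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)                  # fixed seed, synthetic data only
increments = rng.normal(loc=0.0005, scale=0.01, size=1000)
prices = pd.Series(100.0 * np.exp(np.cumsum(increments)))  # log-normal price path

# Log return on day i: log(price_i / price_{i-1})
log_returns = np.log(prices / prices.shift(1)).dropna()

# Log returns are additive: their sum equals the log of the total price change
total_log_return = np.log(prices.iloc[-1] / prices.iloc[0])
print(np.isclose(log_returns.sum(), total_log_return))

# Rough normality diagnostics: both are close to 0 for a normal sample
print("skewness:", log_returns.skew())
print("excess kurtosis:", log_returns.kurt())
```

On real prices the skewness and excess kurtosis are often further from zero than on simulated data, which is why the normality assumption should be checked rather than taken for granted.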

The most common error instructors make in **context cells** is referring to the student or the project. We want project notebooks to appear as a blog post or a data analysis. Bad: *"In this task, you will..."* Good: *"Next, we will..."*

In [4]:
# print the stock_df dataframe information
# 1. Check for the datatypes of all the columns,
# 2. no column should have an object datatype
# 3. Check the index of the dataframe, should be a datetime index 
# because this is a time series analysis
stock_df.info()  # info() prints its summary directly and returns None

# Normality check by plotting the data:
stock_df.plot()

# Add a 'returns' column holding the daily log returns: the natural log of
# the closing price on day i divided by the closing price on day (i-1).
# shift(1) aligns each row with the previous day's close.
# Assumes the closing price column is named 'Close', as in task 3.
stock_df['returns'] = np.log(stock_df['Close'] / stock_df['Close'].shift(1))


## 3. Formalize the strategy

The strategy rules are as follows:

1. Yesterday must have been a down day, with a close-to-close drop of at least 0.25%.
2. If AAPL opens more than 0.1% below yesterday's close today, go long at the open and exit at the close.
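
The two rules above can be sketched as boolean conditions on percentage changes. This is a minimal illustration on made-up prices: the `ohlc` DataFrame, its values and the names `go_long` and `strategy_return` are assumptions, and transaction costs are ignored.

```python
import pandas as pd

# Made-up open and close prices, for illustration only
ohlc = pd.DataFrame({
    "Open":  [100.0, 99.5, 98.5, 99.8, 100.2],
    "Close": [100.0, 99.0, 99.6, 100.0, 99.9],
})

cc = ohlc["Close"].pct_change() * 100                    # close-to-close change, %
co = (ohlc["Open"] / ohlc["Close"].shift(1) - 1) * 100   # previous close to open, %
oc = (ohlc["Close"] / ohlc["Open"] - 1) * 100            # open-to-close change, %

dropped_yesterday = cc.shift(1) <= -0.25   # rule 1: yesterday fell by at least 0.25%
gapped_down_today = co < -0.1              # rule 2: today opens down by more than 0.1%
go_long = dropped_yesterday & gapped_down_today

# Long from open to close on signal days, flat otherwise
strategy_return = oc.where(go_long, 0.0)
print(pd.DataFrame({"go_long": go_long, "strategy_return": strategy_return}))
```

Comparisons against the missing values in the first rows evaluate to `False`, so no trade is taken until both conditions can actually be computed.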


In [3]:
# Code and comments for the third task
# It should consist of up to 10 lines of code (not including comments)
# and take at most 10 seconds to execute on an average laptop.

# To backtest the strategy, we create a new dataframe with three new columns

strat_data = pd.DataFrame(index=stock_df.index)
strat_data['cc'] = stock_df['Close'].pct_change()*100   # close to close change in percent
strat_data['co'] = (stock_df['Open']/stock_df['Close'].shift(1)-1)*100   # previous close to open in %
strat_data['oc'] = (stock_df['Close']/stock_df['Open']-1)*100   # open to close change in percent
strat_data.head()
strat_data.plot()

*Stop here! Only the three first tasks. :)*