## Crypto Arbitrage

In this Challenge, you'll take on the role of an analyst at a high-tech investment firm. The vice president (VP) of your department is considering arbitrage opportunities in Bitcoin and other cryptocurrencies. As Bitcoin trades on markets across the globe, can you capitalize on simultaneous price dislocations in those markets by using the powers of Pandas?

For this assignment, you’ll sort through historical trade data for Bitcoin on two exchanges: Bitstamp and Coinbase. Your task is to apply the three phases of financial analysis to determine if any arbitrage opportunities exist for Bitcoin.

This aspect of the Challenge will consist of 3 phases.

1. Collect the data.

2. Prepare the data.

3. Analyze the data. 



###  Import the required libraries and dependencies.

In [None]:
import pandas as pd
from pathlib import Path
%matplotlib inline

## Collect the Data

To collect the data that you’ll need, complete the following steps:

Instructions. 

1. Using the Pandas `read_csv` function and the `Path` module, import the data from the `bitstamp.csv` file, and create a DataFrame called `bitstamp`. Set the Timestamp column as the DatetimeIndex, and be sure to parse and format the dates.

2. Use the `head` (and/or the `tail`) function to confirm that Pandas properly imported the data.

3. Repeat Steps 1 and 2 for the `coinbase.csv` file.

### Step 1: Using the Pandas `read_csv` function and the `Path` module, import the data from the `bitstamp.csv` file, and create a DataFrame called `bitstamp`. Set the Timestamp column as the DatetimeIndex, and be sure to parse and format the dates.

In [None]:
# Read in the CSV file called "bitstamp.csv" using the Path module. 
# The CSV file is located in the Resources folder.
# Set the index to the column "Timestamp"
# Set the parse_dates and infer_datetime_format parameters
bitstamp = pd.read_csv(
    Path("Resources/bitstamp.csv"),
    index_col="Timestamp",
    parse_dates=True,
    infer_datetime_format=True
)
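
A minimal sketch of the same import pattern on an in-memory CSV, so it runs without the `Resources` folder. The two sample rows and the `sample` name are made up for illustration and are not taken from `bitstamp.csv`. `infer_datetime_format` is left out here because, as far as I know, it has been deprecated since pandas 2.0; `parse_dates=True` alone should be enough to parse the index.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for Resources/bitstamp.csv
sample_csv = io.StringIO(
    "Timestamp,Open,Close\n"
    "2018-01-01 00:00:00,13681.04,$13646.48\n"
    "2018-01-01 00:01:00,13646.48,$13658.75\n"
)

# index_col picks the Timestamp column; parse_dates=True converts the index to datetimes
sample = pd.read_csv(sample_csv, index_col="Timestamp", parse_dates=True)

# Confirm that the index was parsed as dates rather than left as strings
print(type(sample.index).__name__)
print(sample.head())
```

If the index still prints as plain strings, the dates were not parsed, and time-based slicing with `loc` later in the notebook would not behave as expected.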


### Step 2: Use the `head` (and/or the `tail`) function to confirm that Pandas properly imported the data.

In [None]:
# Use the head (and/or tail) function to confirm that the data was imported properly.
bitstamp.head(), bitstamp.tail()

### Step 3: Repeat Steps 1 and 2 for the `coinbase.csv` file.

In [None]:
# Read in the CSV file called "coinbase.csv" using the Path module. 
# The CSV file is located in the Resources folder.
# Set the index to the column "Timestamp"
# Set the parse_dates and infer_datetime_format parameters


coinbase = pd.read_csv(
    Path("Resources/coinbase.csv"),
    index_col="Timestamp",
    parse_dates=True,
    infer_datetime_format=True
)

In [None]:
# Use the head (and/or tail) function to confirm that the data was imported properly.
coinbase.head(),coinbase.tail()

## Prepare the Data

To prepare and clean your data for analysis, complete the following steps:

1. For the bitstamp DataFrame, replace or drop all `NaN`, or missing, values in the DataFrame.

2. Use the `str.replace` function to remove the dollar signs ($) from the values in the Close column.

3. Convert the data type of the Close column to a `float`.

4. Review the data for duplicated values, and drop them if necessary.

5. Repeat Steps 1–4 for the coinbase DataFrame.

### Step 1: For the bitstamp DataFrame, replace or drop all `NaN`, or missing, values in the DataFrame.

In [None]:
# For the bitstamp DataFrame, replace or drop all NaNs or missing values in the DataFrame
bitstamp.isnull().sum()

In [None]:
bitstamp = bitstamp.dropna().copy()

### Step 2: Use the `str.replace` function to remove the dollar signs ($) from the values in the Close column.

In [None]:
# Use the str.replace function to remove the dollar sign, $
# regex=False treats "$" as a literal character rather than a regex anchor
bitstamp["Close"] = bitstamp["Close"].str.replace("$", "", regex=False)
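
A small sketch of the clean-up on made-up values, shown because `$` is also a regular-expression metacharacter (the end-of-string anchor). Passing `regex=False` makes the literal intent explicit whatever the default is in the installed pandas version. The `prices` and `cleaned` names are hypothetical.

```python
import pandas as pd

# Hypothetical Close values as they might appear in the raw CSV
prices = pd.Series(["$13646.48", "$13658.75"])

# Remove the literal dollar sign, then convert the strings to floats
cleaned = prices.str.replace("$", "", regex=False).astype("float")

print(cleaned.dtype)
```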

### Step 3: Convert the data type of the Close column to a `float`.

In [None]:
# Convert the Close data type to a float
bitstamp.dtypes

In [None]:
bitstamp["Close"] = bitstamp["Close"].astype("float")

### Step 4: Review the data for duplicated values, and drop them if necessary.

In [None]:
# Review the data for duplicate values, and drop them if necessary
bitstamp.duplicated().sum()
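
A hedged sketch of what `duplicated` actually compares, using made-up prices. `DataFrame.duplicated` looks at column values only and ignores the index, so two different minutes with identical prices are flagged even though they are separate observations. Checking the index separately is one way to decide whether dropping is really necessary; the `frame` and `deduped` names are hypothetical.

```python
import pandas as pd

# Hypothetical prices: the first two rows share a value but not a timestamp
idx = pd.to_datetime(["2018-01-01 00:00", "2018-01-01 00:01", "2018-01-01 00:02"])
frame = pd.DataFrame({"Close": [13646.48, 13646.48, 13658.75]}, index=idx)

value_dupes = frame.duplicated().sum()        # compares column values only
index_dupes = frame.index.duplicated().sum()  # compares timestamps only

# Drop rows only when the timestamp itself is repeated
deduped = frame[~frame.index.duplicated(keep="first")]

print(value_dupes, index_dupes, len(deduped))
```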

In [None]:
# Final data review for bitstamp
bitstamp.isnull().sum(), bitstamp.dtypes, bitstamp.duplicated().sum()

### Step 5: Repeat Steps 1–4 for the coinbase DataFrame.

In [None]:
# Repeat Steps 1–4 for the coinbase DataFrame

# Step 1 - replace or drop NaN's

coinbase.isnull().sum()

In [None]:
coinbase = coinbase.dropna().copy()

In [None]:
# Step 2 - remove the "$"s

coinbase["Close"] = coinbase["Close"].str.replace("$", "", regex=False)

In [None]:
# Step 3 - convert 'Close' column data to 'float' 

coinbase["Close"] = coinbase["Close"].astype("float")

In [None]:
# Step 4 - remove any duplicates

coinbase.duplicated().sum()

In [None]:
# final data review for coinbase 
coinbase.isnull().sum(), coinbase.dtypes, coinbase.duplicated().sum()

## Analyze the Data

Your analysis consists of the following tasks: 

1. Choose the columns of data on which to focus your analysis.

2. Get the summary statistics and plot the data.

3. Focus your analysis on specific dates.

4. Calculate the arbitrage profits.

### Step 1: Choose columns of data on which to focus your analysis.

Select the data you want to analyze. Use `loc` or `iloc` to select the following columns of data for both the bitstamp and coinbase DataFrames:

* Timestamp (index)

* Close


In [None]:
# Use loc or iloc to select `Timestamp (the index)` and `Close` from bitstamp DataFrame
# shortened name of df named in starter code to 'btsp'

btsp = bitstamp_sliced = bitstamp.loc[:,['Close']]

# Review the first five rows of the DataFrame
btsp.head()

In [None]:
# Use loc or iloc to select `Timestamp (the index)` and `Close` from coinbase DataFrame
# shortened name of df named in starter code to 'coin'

coin = coinbase_sliced = coinbase.loc[:,['Close']]

# Review the first five rows of the DataFrame
coin.head()

### Step 2: Get summary statistics and plot the data.

Sort through the time series data associated with the bitstamp and coinbase DataFrames to identify potential arbitrage opportunities. To do so, complete the following steps:

1. Generate the summary statistics for each DataFrame by using the `describe` function.

2. For each DataFrame, create a line plot for the full period of time in the dataset. Be sure to tailor the figure size, title, and color to each visualization.

3. In one plot, overlay the visualizations that you created in Step 2 for bitstamp and coinbase. Be sure to adjust the legend and title for this new visualization.

4. Using the `loc` and `plot` functions, plot the price action of the assets on each exchange for different dates and times. Your goal is to evaluate how the spread between the two exchanges changed across the time period that the datasets define. Did the degree of spread change as time progressed?

In [None]:
# Generate the summary statistics for the bitstamp DataFrame
btsp.describe()

In [None]:
# Generate the summary statistics for the coinbase DataFrame
coin.describe()

In [None]:
# Create a line plot for the bitstamp DataFrame for the full length of time in the dataset 
# Be sure that the figure size, title, and color are tailored to each visualization

# Colors chosen from a web-safe palette: https://www.colorhexa.com/web-safe-colors

btsp['Close'].plot(
    figsize=(12,6),
    legend=True,
    label='BTC',
    ylabel="BTC Price (USD$)",
    title="Bitstamp Exchange: 2018 - Q1",
    color="#666600" 
) 

In [None]:
# Create a line plot for the coinbase DataFrame for the full length of time in the dataset 
# Be sure that the figure size, title, and color are tailored to each visualization

# Color '#ff9900' chosen for visibility and good contrast with the dark green above: https://www.colorhexa.com/web-safe-colors

coin['Close'].plot(
    figsize=(12,6), 
    legend=True,
    label='BTC',
    ylabel="BTC Price (USD$)", 
    title="Coinbase Exchange: 2018 - Q1",
    color="#ff9900"
)

In [None]:
# Overlay the visualizations for the bitstamp and coinbase DataFrames in one plot
# The plot should visualize the prices over the full length of the dataset
# Be sure to include the parameters: legend, figure size, title, and color and label

ax=btsp['Close'].plot(
    figsize=(18,6),
    legend=True,
    ylabel="BTC Price (USD$)",
    title="BTC Arbitrage Opportunities: Q1 2018 on BTSP and COIN",
    color="#666600",
    label="BTC on Bitstamp"
)
coin['Close'].plot(
    ax=ax,
    legend=True,
    color="#ff9900", 
    label="BTC on Coinbase"
)

In [None]:
### EARLY WINDOW ###
# Using the loc and plot functions, create an overlay plot that visualizes 
# the price action of both DataFrames for a one month period early in the dataset
# Be sure to include the parameters: legend, figure size, title, and color and label

bitstamp['Close'].loc['2018-01-20' : '2018-02-18'].plot(
    legend=True,
    figsize=(18,6),
    ylabel="BTC Price (USD$)",
    title="BTC Arbitrage Opportunities (early window): BTSP and COIN",
    color="#666600",
    label="BTC on Bitstamp"
)
coinbase['Close'].loc['2018-01-20' : '2018-02-18'].plot(
    legend=True,
    color="#ff9900",
    label="BTC on Coinbase"
)

In [None]:
### MIDDLE WINDOW ###
# Using the loc and plot functions, create an overlay plot that visualizes 
# the price action of both DataFrames for a one month period in the middle of the dataset
# Be sure to include the parameters: legend, figure size, title, and color and label

bitstamp['Close'].loc['2018-02-10' : '2018-03-10'].plot(
    legend=True,
    figsize=(18,6),
    ylabel="BTC Price (USD$)",
    title="BTC Arbitrage Opportunities (middle window): BTSP and COIN",
    color="#666600",
    label="BTC on Bitstamp")

coinbase['Close'].loc['2018-02-10' : '2018-03-10'].plot(
    legend=True,
    color="#ff9900",
    label="BTC on Coinbase"
)

In [None]:
### LATER WINDOW ###
# Using the loc and plot functions, create an overlay plot that visualizes
# the price action of both DataFrames for a one month period later in the dataset
# Be sure to include the parameters: legend, figure size, title, and color and label 

bitstamp['Close'].loc['2018-02-20' : '2018-03-20'].plot(
    legend=True,
    figsize=(18,6),
    title="BTC Arbitrage Opportunities (later window): BTSP and COIN",
    color="#666600",
    label="BTC on Bitstamp")

coinbase['Close'].loc['2018-02-20' : '2018-03-20'].plot(
    legend=True,
    figsize=(18,6),
    ylabel="BTC Price (USD$)",
    color="#ff9900",
    label="BTC on Coinbase")

**Question** Based on the visualizations of the different time periods, has the degree of spread changed as time progressed?

**Answer** Yes. At the beginning of Q1 2018, BTC traded in a close range on both exchanges, with little spread. By the end of January 2018, the spread had widened significantly, enough to suggest an arbitrage opportunity from the visual alone. From February through March 2018, the spread appears to stay narrow.

### Step 3: Focus Your Analysis on Specific Dates

Focus your analysis on specific dates by completing the following steps:

1. Select three dates to evaluate for arbitrage profitability. Choose one date that’s early in the dataset, one from the middle of the dataset, and one from the later part of the time period.

2. For each of the three dates, generate the summary statistics and then create a box plot. This big-picture view is meant to help you gain a better understanding of the data before you perform your arbitrage calculations. As you compare the data, what conclusions can you draw?

In [None]:
# 1. Select three dates to evaluate for arbitrage profitability ### doing this first makes it easier to follow 'DRY' coding ###

### Jan. 28 - early date #### BTSP > COIN
early_btsp = btsp.loc[ 
    "2018-01-28":"2018-01-28"
]
early_coin = coin.loc[
    "2018-01-28":"2018-01-28"
]

### Feb. 19 - middle date #### BTSP > COIN
mid_btsp = btsp.loc[
    "2018-02-19":"2018-02-19"
]
mid_coin = coin.loc[
    "2018-02-19":"2018-02-19"
]

### Mar. 14 - late date #### BTSP > COIN
late_btsp = btsp.loc[
    "2018-03-14":"2018-03-14"
]
late_coin = coin.loc[
    "2018-03-14":"2018-03-14"
]
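
A short sketch of how date-string slicing selects a single day from a DatetimeIndex, using made-up minute data that straddles midnight. With `loc`, a date string matches every timestamp that falls on that date, so the slice `"2018-01-28":"2018-01-28"` and the single label `"2018-01-28"` should return the same rows. The `prices` DataFrame is hypothetical.

```python
import pandas as pd

# Hypothetical minute data: two rows on Jan. 27 and three rows on Jan. 28
idx = pd.date_range("2018-01-27 23:58", periods=5, freq="min")
prices = pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Both forms select every row whose timestamp falls on Jan. 28
one_day = prices.loc["2018-01-28"]
same_day = prices.loc["2018-01-28":"2018-01-28"]

print(len(one_day), one_day.equals(same_day))
```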

In [None]:
### EARLY WINDOW SAMPLE DATE ###

# Create an overlay plot that visualizes the two dataframes over a period of one day early in the dataset. 
# Be sure that the plots include the parameters `legend`, `figsize`, `title`, `color` and `label` 

ax=early_btsp['Close'].plot(
    figsize=(18,6),
    legend=True, 
    label="Bitstamp",
    ylabel="Price (USD$)",
    title="Early BTC price spread January 28, 2018: BTSP and COIN",
    color="#666600"
)

early_coin['Close'].plot(
    ax=ax,
    legend=True,
    label="Coinbase",
    color="#ff9900"
)

In [None]:
# Using the early date that you have selected, calculate the arbitrage spread 
# by subtracting the lower coinbase closing prices from the higher bitstamp closing prices

# BTSP > COIN - for BTC
arb_sprd_0128 = early_btsp['Close'] - early_coin['Close']

# Generate summary statistics for the early DataFrame
arb_sprd_0128.describe()
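
A minimal sketch of how the subtraction behaves when the two exchanges do not share every timestamp. Pandas aligns the two Series on their index before subtracting, so a minute that is missing on either side produces `NaN` rather than a misaligned difference. The prices and the `sell` / `buy` names below are made up for illustration.

```python
import pandas as pd

idx_sell = pd.to_datetime(["2018-01-28 00:00", "2018-01-28 00:01", "2018-01-28 00:02"])
idx_buy = pd.to_datetime(["2018-01-28 00:00", "2018-01-28 00:02"])  # 00:01 is missing

sell = pd.Series([11500.0, 11510.0, 11520.0], index=idx_sell)  # higher-priced exchange
buy = pd.Series([11300.0, 11330.0], index=idx_buy)             # lower-priced exchange

# Subtraction aligns on the timestamp index; unmatched minutes become NaN
spread = sell - buy

print(spread)
```

`describe` skips the `NaN` entries, so the summary statistics reflect only minutes that exist on both exchanges.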


In [None]:
# Visualize the arbitrage spread from early in the dataset in a box plot
arb_sprd_0128.plot(
    kind="box",
    figsize=(10,4),
    title="Early BTC Arbitrage Spread: Jan. 28, 2018"
)

In [None]:
### MIDDLE WINDOW SAMPLE DATE ###
# Create an overlay plot that visualizes the two dataframes over a period of one day from the middle of the dataset. 
# Be sure that the plots include the parameters `legend`, `figsize`, `title`, `color` and `label` 

ax=mid_btsp['Close'].plot(
    legend=True,
    label="Bitstamp",
    figsize=(18,6),
    ylabel="BTC Price (USD$)",
    title="Middle BTC price spread - February 19, 2018: BTSP and COIN",
    color="#666600"
)                      
mid_coin['Close'].plot(
    ax=ax,
    legend=True,
    label="Coinbase",
    color="#ff9900"
)

In [None]:
# Using the date in the middle that you have selected, calculate the arbitrage spread 
# by subtracting the lower coinbase closing prices from the higher bitstamp closing prices

# BTSP > COIN
arb_sprd_0219 = mid_btsp['Close'] - mid_coin['Close']

# Generate summary statistics 
arb_sprd_0219.describe()

In [None]:
# Visualize the arbitrage spread from the middle of the dataset in a box plot
arb_sprd_0219.plot(
    kind="box",
    legend=True,
    figsize=(10,4),
    title="Middle BTC Arbitrage Spread: Feb. 19, 2018"
)

In [None]:
# Create an overlay plot that visualizes the two dataframes over a period of one day from late in the dataset. 
# Be sure that the plots include the parameters `legend`, `figsize`, `title`, `color` and `label` 

ax=late_btsp['Close'].plot(
    legend=True,
    label="Bitstamp",
    figsize=(18,6),
    ylabel="BTC Price (USD$)",
    title="Late BTC price spread - March 14, 2018: BTSP and COIN",
    color="#666600")
                      
late_coin['Close'].plot(
    ax=ax,
    legend=True,
    label="Coinbase",
    color="#ff9900")

In [None]:
# Using the date in the later third of df that you have selected, calculate the arbitrage spread 
# by subtracting the lower coinbase closing prices from the higher bitstamp closing prices

# BTSP > COIN
arb_sprd_0314 = late_btsp['Close'] - late_coin['Close']

# Generate summary statistics 
arb_sprd_0314.describe()

In [None]:
# Visualize the arbitrage spread from late in the dataset in a box plot
arb_sprd_0314.plot(
    kind="box",
    legend=True,
    figsize=(10,4),
    title="Late BTC Arbitrage Spread: Mar. 14, 2018"
)

### Step 4: Calculate the Arbitrage Profits

Calculate the potential profits for each date that you selected in the previous section. Your goal is to determine whether arbitrage opportunities still exist in the Bitcoin market. Complete the following steps:

1. For each of the three dates, measure the arbitrage spread between the two exchanges by subtracting the lower-priced exchange from the higher-priced one. Then use a conditional statement to generate the summary statistics for each arb_sprd DataFrame, where the spread is greater than zero.

2. For each of the three dates, calculate the spread returns. To do so, divide the instances that have a positive arbitrage spread (that is, a spread greater than zero) by the price of Bitcoin from the exchange you’re buying on (that is, the lower-priced exchange). Review the resulting DataFrame.

3. For each of the three dates, narrow down your trading opportunities even further. To do so, determine the number of times your trades with positive returns exceed the 1% minimum threshold that you need to cover your costs.

4. Generate the summary statistics of your spread returns that are greater than 1%. How do the average returns compare among the three dates?

5. For each of the three dates, calculate the potential profit, in dollars, per trade. To do so, multiply the spread returns that were greater than 1% by the cost of what was purchased. Make sure to drop any missing values from the resulting DataFrame.

6. Generate the summary statistics, and plot the results for each of the three DataFrames.

7. Calculate the potential arbitrage profits that you can make on each day. To do so, sum the elements in the profit_per_trade DataFrame.

8. Using the `cumsum` function, plot the cumulative sum of each of the three DataFrames. Can you identify any patterns or trends in the profits across the three time periods?

(NOTE: The starter code displays only one date. You'll want to do this analysis for two additional dates).
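
The eight steps above can be sketched end to end on a handful of made-up prices. This is only an illustration under stated assumptions: the four prices, the `sell_px` / `buy_px` names, and the choice of which exchange is cheaper are all hypothetical, and the 1% threshold is the one given in the instructions.

```python
import pandas as pd

idx = pd.date_range("2018-01-28 00:00", periods=4, freq="min")
sell_px = pd.Series([10200.0, 10050.0, 9990.0, 10300.0], index=idx)  # higher-priced exchange
buy_px = pd.Series([10000.0, 10000.0, 10000.0, 10000.0], index=idx)  # lower-priced exchange

# Step 1: spread, higher-priced exchange minus lower-priced exchange
spread = sell_px - buy_px

# Step 2: return on each positive spread, relative to the purchase price
spread_return = spread[spread > 0] / buy_px

# Steps 3 and 4: keep only the returns that clear the 1% cost threshold
profitable = spread_return[spread_return > 0.01]

# Step 5: profit in dollars per trade; dropna removes the minutes that were filtered out
profit_per_trade = (profitable * buy_px).dropna()

# Steps 7 and 8: total and running profit for the day
total_profit = profit_per_trade.sum()
cumulative_profit = profit_per_trade.cumsum()

print(profit_per_trade)
print(total_profit)
```

Note that the division and the multiplication both re-align against the full price Series, which is why the filtered-out minutes reappear as `NaN` and need `dropna` before summing.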

#### 1. For each of the three dates, measure the arbitrage spread between the two exchanges by subtracting the lower-priced exchange from the higher-priced one. Then use a conditional statement to generate the summary statistics for each arb_sprd DataFrame, where the spread is greater than zero.

*NOTE*: For illustration, only one of the three dates is shown in the starter code below.

In [None]:
# For each of the three dates, the arbitrage spread between the two exchanges was measured above
# by subtracting the lower-priced exchange from the higher-priced one

# Use a conditional statement to generate the summary statistics for each arb_sprd Series
(
    arb_sprd_0128[arb_sprd_0128 > 0].describe(),
    arb_sprd_0219[arb_sprd_0219 > 0].describe(),
    arb_sprd_0314[arb_sprd_0314 > 0].describe(),
)


#### 2. For each of the three dates, calculate the spread returns. To do so, divide the instances that have a positive arbitrage spread (that is, a spread greater than zero) by the price of Bitcoin from the exchange you’re buying on (that is, the lower-priced exchange). Review the resulting DataFrame.

In [None]:
# For each of the three dates, calculate the spread returns by dividing the instances when the arbitrage spread is positive (> 0) 
# by the price of Bitcoin from the exchange you are buying on (the lower-priced exchange, here Coinbase).

# Early date: Jan. 28
spread_return_0128 = arb_sprd_0128[arb_sprd_0128>0] / early_coin['Close']

# Middle date: Feb. 19
spread_return_0219 = arb_sprd_0219[arb_sprd_0219>0] / mid_coin['Close']

# Late date: Mar. 14
spread_return_0314 = arb_sprd_0314[arb_sprd_0314>0] / late_coin['Close']


# Review the spread return DataFrame
display(spread_return_0128.head(10), spread_return_0128.tail(10))
display(spread_return_0219.head(10), spread_return_0219.tail(10))
display(spread_return_0314.head(10), spread_return_0314.tail(10))

#### 3. For each of the three dates, narrow down your trading opportunities even further. To do so, determine the number of times your trades with positive returns exceed the 1% minimum threshold that you need to cover your costs.

In [None]:
# For the date early in the dataset, determine the number of times your trades with positive returns 
# exceed the 1% minimum threshold (.01) that you need to cover your costs
profitable_trading_0128 = spread_return_0128[spread_return_0128 > .01]
profitable_trading_0219 = spread_return_0219[spread_return_0219 > .01]
profitable_trading_0314 = spread_return_0314[spread_return_0314 > .01]

# Review the first five profitable trades
profitable_trading_0128.head(), profitable_trading_0219.head(), profitable_trading_0314.head()

#### 4. Generate the summary statistics of your spread returns that are greater than 1%. How do the average returns compare among the three dates?

In [None]:
# For the date early in the dataset, generate the summary statistics for the profitable trades
# (that is, the trades where the spread returns are greater than 1%)
profitable_trading_0128.describe(), profitable_trading_0219.describe(), profitable_trading_0314.describe()

#### 5. For each of the three dates, calculate the potential profit, in dollars, per trade. To do so, multiply the spread returns that were greater than 1% by the cost of what was purchased. Make sure to drop any missing values from the resulting DataFrame.

In [None]:
# For each of the three dates, calculate the potential profit per trade in dollars 
# Multiply the profitable spread returns by the cost of the Bitcoin that was purchased
# (the Coinbase 'Close' Series, so that both operands align on Timestamp)

profit_0128 = profitable_trading_0128 * early_coin['Close']
profit_0219 = profitable_trading_0219 * mid_coin['Close']
profit_0314 = profitable_trading_0314 * late_coin['Close']

# Drop any missing values from the profit Series
profit_0128 = profit_0128.dropna().copy()
profit_0219 = profit_0219.dropna().copy()
profit_0314 = profit_0314.dropna().copy()


# View the first five rows of each profit Series
profit_0128.head(), profit_0219.head(),  profit_0314.head()


#### 6. Generate the summary statistics, and plot the results for each of the three DataFrames.

In [None]:
#  Generate the summary statistics for the early profit per trade DataFrame

profit_0128.describe(), profit_0219.describe(), profit_0314.describe()

In [None]:
# Plot the results for the early profit per trade DataFrame
profit_0128.plot(
    figsize=(20, 8),
    legend=True,
    label="BTC Arbitrage Profit",
    ylabel="Profit (USD$)",
    title="Potential Profit Per Trade - Early Sample Date",
    color="#669900")

In [None]:
profit_0219.plot(
    figsize=(20, 8),
    legend=True,
    label="BTC Arbitrage Profit",
    ylabel="Profit (USD$)",
    title="Potential Profit Per Trade - Middle Sample Date",
    color="#669900")

In [None]:
profit_0314.plot(
    figsize=(20,8),
    legend=True,
    label="BTC Arbitrage Profit",
    ylabel="Profit (USD$)",
    title="Potential Profit Per Trade - Later Sample Date",
    color="#669900")

#### 7. Calculate the potential arbitrage profits that you can make on each day. To do so, sum the elements in the profit_per_trade DataFrame.

In [None]:
# Calculate the sum of the potential profits for each of the three profit per trade Series
profit_0128.sum(), profit_0219.sum(), profit_0314.sum()

#### 8. Using the `cumsum` function, plot the cumulative sum of each of the three DataFrames. Can you identify any patterns or trends in the profits across the three time periods?

In [None]:
# Use the cumsum function to calculate the cumulative profits over time for the early profit per trade DataFrame
cumulative_profit_0128 = profit_0128.cumsum()
cumulative_profit_0219 = profit_0219.cumsum()
cumulative_profit_0314 = profit_0314.cumsum()

In [None]:
# Plot the cumulative sum of profits for the early profit per trade DataFrame
cumulative_profit_0128.plot(
    figsize=(20, 8),
    legend=True,
    label="Early date - Maximum Potential Arbitrage Profit", 
    ylabel="Profit (USD$)",
    title="Cumulative Profit - Early Sample Date",
    color="#669900")

In [None]:
# Plot the cumulative sum of profits for the middle profit per trade DataFrame
cumulative_profit_0219.plot(
    figsize=(20, 8),
    legend=True,
    label="Middle date - Maximum Potential Arbitrage Profit", 
    ylabel="Profit (USD$)",
    title="Cumulative Profit - Middle Sample Date",
    color="#669900")

In [None]:
# Plot the cumulative sum of profits for the later profit per trade DataFrame
cumulative_profit_0314.plot(
    figsize=(20, 8),
    legend=True,
    label="Later date - Maximum Potential Arbitrage Profit", 
    ylabel="Profit (USD$)",
    title="Cumulative Profit - Later Sample Date",
    color="#669900"
)

**Question:** After reviewing the profit information across each date from the different time periods, can you identify any patterns or trends?
    
**Answer:** Yes. On both the early and later sample dates, there are measurable per-trade and cumulative profits. The data supports the conclusion that buying exclusively on one exchange and selling on the other is profitable only when the spread is positive and large enough to cover trading costs.

Early Sample: Jan. 28 - most, if not all, trades were profitable throughout the day.
Later Sample: Mar. 14 - profitable trades occurred only between 12:31 pm and 12:34 pm, and the cumulative profit is measurable.

Middle Sample: Feb. 19 - there were instances where a positive spread existed; however, none of those trades cleared the 1% threshold needed to cover costs. Therefore the middle date was not a profitable day for BTC arbitrage when buying on the COIN exchange and selling on the BTSP exchange. Because of this tight trading range, the cumulative profit plot was empty and its x-axis did not populate with valid "Timestamp" values.
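
One way to avoid the empty plot described for the middle date is to check whether any trades survived the threshold before plotting. The sketch below uses made-up returns that all fall under 1%; the `returns` and `message` names are hypothetical.

```python
import pandas as pd

# Hypothetical spread returns for a tight-range day: none exceed 1%
idx = pd.date_range("2018-02-19 00:00", periods=3, freq="min")
returns = pd.Series([0.002, 0.004, 0.001], index=idx)

profitable = returns[returns > 0.01]

# An empty Series has nothing to plot, so report that instead
if profitable.empty:
    message = "No trades cleared the 1% threshold on this date."
else:
    message = f"{len(profitable)} trades cleared the 1% threshold."

print(message)
```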

# EXTRA VISUALS AND COMPARISONS
```
adding a few more visuals to demonstrate observations...
```


In [None]:
# close up on February visually confirms that trading range was tight

bitstamp['Close'].loc['2018-02-01' : '2018-02-28'].plot(
    figsize=(18,6),
    legend=True,
    ylabel="BTC Price (USD$)",
    title="BTC Arbitrage - Exhibit A: BTSP and COIN - Feb. 2018",
    color="#666600",
    label="Bitstamp"
)
coinbase['Close'].loc['2018-02-01' : '2018-02-28'].plot(
    figsize=(18,6),
    legend=True,
    color="#ff9900",
    label="Coinbase"
)

# The most profitable arbitrage window for Q1 2018 was January 27-29:
### *a visual analysis of the dataframes reveals widespread arbitrage opportunities*...
### 'BUY' *BTC* on COIN exchange, then 'SELL' the *BTC* on the BTSP exchange.

In [None]:
### Jan. 27 to Jan. 29 #### BTSP > COIN

widespread_btsp = btsp.loc[ 
    "2018-01-27":"2018-01-29"
]
widespread_coin = coin.loc[
    "2018-01-27":"2018-01-29"
]

In [None]:
# January 27-29 - largest visual separation for 'BTC' arbitrage 

# Plot the Close Series from each exchange so the legend uses the labels below
ax=widespread_btsp['Close'].plot(
    legend=True,
    figsize=(18,6),
    ylabel="BTC Price (USD$)",
    title="Best 3-day Window of BTC Arbitrage Opportunities - Q1 2018: BTSP and COIN",
    color="#666600", 
    label="BTSP",
    )
widespread_coin['Close'].plot(
    ax=ax,
    legend=True,
    color="#ff9900",
    label="COIN"
    )

In [None]:
# Using the date range that you have selected, calculate the arbitrage spread for 'widespread_arb_window'
# by subtracting the lower-priced exchange from the higher-priced one

# BTSP > COIN - for BTC
widespread_arb_window = widespread_btsp['Close'] - widespread_coin['Close']

In [None]:
widespread_arb_window

In [None]:
# Visualize the arbitrage spread for the Jan. 27-29 window in a box plot
widespread_arb_window.plot(
    kind="box",
    legend=True,
    figsize=(10,4),
    title="Maximum Profit Opportunity Window - BTC Arbitrage Spread: Jan. 27-29, 2018"
)

In [None]:
widespread_return = widespread_arb_window[widespread_arb_window > 0] / widespread_coin['Close']

In [None]:
widespread_return.head(), widespread_return.tail()

In [None]:
widespread_return.describe()

In [None]:
widespread_profitable_trades = widespread_return[widespread_return > .01]

In [None]:
widespread_profitable_trades

In [None]:
widespread_profitable_trades.describe()

In [None]:
widespread_profits = widespread_profitable_trades * widespread_coin['Close']

In [None]:
widespread_profits = widespread_profits.dropna()

In [None]:
widespread_profits.describe()

In [None]:
# Plot results for the 'widespread_profits' 
widespread_profits.plot(
    figsize=(20, 8),
    legend=True,
    label="BTC - Maximum Potential Arbitrage Profit", 
    ylabel="Profit (USD$)",
    title="Potential Profit Per Trade - Jan. 27 to Jan. 29",
    color="#669900")

In [None]:
widespread_profits.sum()

In [None]:
cumulative_widespread_profit = widespread_profits.cumsum()

In [None]:
cumulative_widespread_profit.plot(
    figsize=(20, 8),
    legend=True,
    label="'BTC' Cumulative Maximum Potential Arbitrage Profit", 
    ylabel="Arbitrage Profit (USD$)", 
    title="Cumulative Profit - Jan. 27 to Jan. 29",
    color="#669900")