## Technical analysis tool

With this dataset, we can perform a wide range of analyses and visualizations to understand how the stock price has changed over time and to identify trends or patterns in the data. I will start with a few basic calculations on the data and compare the results with each other.

<img src="TCS.png" width=750 height=400 />

1. **Calculate returns:** 
We can use pandas to calculate the daily returns of the stock, which can be useful for analysis and modeling:

    ```
    # calculate daily returns
    df['Return'] = df['Close'].pct_change()
    # plot the returns over time
    plt.plot(df['Date'], df['Return'])
    plt.title('Deutsche Bank Daily Returns')
    plt.xlabel('Date')
    plt.ylabel('Return')
    plt.show()
    ```

    `pct_change()` is a pandas method that calculates the fractional change between the current and a prior element. It is often used to calculate the daily or periodic returns of a financial asset.

    The formula for the percentage change over a given period is:

    $\text{Percentage change} = \frac{\text{current price} - \text{prior price}}{\text{prior price}} \times 100$

    where "current price" is the price in the current period and "prior price" is the price in the previous period. Note that `pct_change()` returns the change as a fraction (for example, `0.02` for a 2% rise), so multiply the result by 100 to express it as a percentage.
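As a quick check of the formula, the sketch below applies `pct_change()` to a small made-up price series (the values are illustrative, not taken from the dataset) and compares the result with the manual calculation.

```python
import pandas as pd

# Illustrative closing prices (made-up values, not from the dataset)
close = pd.Series([100.0, 102.0, 99.96, 104.958])

# pandas: fractional change relative to the previous element
returns = close.pct_change()

# Manual calculation: (current price - prior price) / prior price
manual = (close - close.shift(1)) / close.shift(1)

# Express the fractional returns as percentages
returns_pct = returns * 100
print(returns_pct.round(2).tolist())
```

The first element is `NaN` because there is no prior price to compare it with.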

2. **Analyze relationships between variables:** You can use seaborn to create scatter plots and regression plots to explore the relationships between variables. For example, to create a scatter plot of the opening price vs. the closing price:

    ```
    # create a scatter plot of Open vs. Close
    sns.scatterplot(data=df, x='Open', y='Close')
    plt.title('Deutsche Bank Open vs. Close')
    plt.show()
    ```

3. **Compare with market indices:** Once price data for a market index such as the DAX, FTSE, or S&P 500 has been loaded into a DataFrame, you can use pandas to compare the Deutsche Bank share data with it.
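A minimal sketch of such a comparison, assuming the index closes are already available as a pandas Series. The two series below are randomly generated stand-ins (`stock_close` and `index_close` are hypothetical names), so only the method, not the numbers, carries over to real data. Both series are rebased to 100 on the first day so that they can be compared on one scale.

```python
import numpy as np
import pandas as pd

# Hypothetical data: in practice these would be the 'Close' columns of the
# share and of an index (e.g. the DAX), loaded from files you already have.
dates = pd.date_range("2023-01-02", periods=60, freq="B")
rng = np.random.default_rng(0)
index_close = pd.Series(15000 * np.cumprod(1 + rng.normal(0, 0.01, 60)), index=dates)
stock_close = pd.Series(10 * np.cumprod(1 + rng.normal(0, 0.02, 60)), index=dates)

# Align on common dates, then rebase both series to 100 on the first day
prices = pd.concat({"Stock": stock_close, "Index": index_close}, axis=1).dropna()
rebased = prices / prices.iloc[0] * 100

# Daily returns and their correlation
returns = prices.pct_change().dropna()
corr = returns["Stock"].corr(returns["Index"])

# Cumulative out- or under-performance of the stock relative to the index
relative = rebased["Stock"] - rebased["Index"]

print(rebased.tail())
print("Return correlation:", round(corr, 3))
```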

4. **Rolling statistics:** 

    Rolling statistics are a way to calculate summary statistics on a rolling window of data over a time series or other ordered data. Rolling statistics can help smooth out fluctuations and identify trends in the data. Here are some common rolling statistics:

- **Rolling Mean:** The rolling mean, also known as moving average, calculates the average of a rolling window of values. This is a common technique to smooth out fluctuations and identify trends in the data.

    _The rolling mean or moving average is a statistical measure that calculates the mean value of a series of data points over a specified period of time. In other words, it calculates the average of a rolling window of values._ 

    Mathematically, the rolling mean for a time series or other ordered data can be calculated as follows:

    $\mu_t = \text{(rolling mean)}_t = \frac{x_t + x_{t-1} + \dots + x_{t-n+1}}{n}$

    where `x` is the time series or ordered data, `t` is the current time period, and `n` is the window size, i.e. the number of data points included in the rolling window. The rolling mean at time `t` is the average of the `n` most recent data points, up to and including `x_t`. This is the convention pandas uses for `rolling(window=n)`.

    The choice of window size `n` determines the degree of smoothing: a larger window gives a smoother, slower-moving trend line, while a smaller window gives a more volatile, faster-moving one. By averaging out noise and short-term fluctuations, the rolling mean highlights longer-term patterns and trends, which makes it particularly useful when the data is noisy.

- **Rolling Standard Deviation:** The rolling standard deviation calculates the standard deviation of a rolling window of values. This can help identify periods of high or low volatility in the data.

    Mathematically, the rolling standard deviation for a time series or other ordered data can be calculated as follows:

    $\sigma_t = \text{(rolling std)}_t = \sqrt{\frac{(x_t - \mu_t)^2 + (x_{t-1} - \mu_t)^2 + \dots + (x_{t-n+1} - \mu_t)^2}{n-1}}$

- **Rolling Correlation:** The rolling correlation calculates the correlation between two time series over a rolling window, i.e. between the window of values of one series and the corresponding window of the other.

    Mathematically, the rolling correlation for two time series or other ordered data can be calculated as follows:

    $\text{(rolling corr)}_t = \frac{\sum_{i=0}^{n-1} \left(x_{t-i} - (\mu_x)_t\right)\left(y_{t-i} - (\mu_y)_t\right)}{(n-1) \, (\sigma_x)_t \, (\sigma_y)_t}$

    **Python code snippet:**
    ```
    import pandas as pd
    import numpy as np
    # Load data into a DataFrame
    data = pd.read_csv('stock_data.csv', index_col='Date')
    # Calculate the rolling mean and standard deviation
    rolling_mean = data['Close'].rolling(window=20).mean()
    rolling_std = data['Close'].rolling(window=20).std()
    # Combine the rolling statistics into a single DataFrame
    rolling_stats = pd.DataFrame({'Rolling Mean': rolling_mean, 'Rolling Std': rolling_std})
    # Print the last 10 rows of the DataFrame
    print(rolling_stats.tail(10))
    ```

    This code assumes that you have loaded data into a CSV file named stock_data.csv, with columns for 'Date', 'Open', 'High', 'Low', and 'Close'. The index_col argument is used to specify that the 'Date' column should be used as the index of the DataFrame.

    The code then uses the rolling method to calculate the rolling mean and standard deviation of the closing prices over a window of 20 days. The window argument is used to specify the size of the rolling window.
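The snippet above covers the rolling mean and standard deviation. A rolling correlation can be sketched in the same way with `Series.rolling(...).corr(other)`. The two series here are randomly generated stand-ins for two 'Close' columns, and the final value is cross-checked against a direct calculation on the last window.

```python
import numpy as np
import pandas as pd

# Two made-up price series standing in for two 'Close' columns
rng = np.random.default_rng(1)
x = pd.Series(100 + rng.normal(0, 1, 50).cumsum())
y = pd.Series(50 + rng.normal(0, 1, 50).cumsum())

window = 20
# Correlation of x and y over each trailing window of 20 observations
rolling_corr = x.rolling(window=window).corr(y)

# Cross-check the final value against a direct calculation on the last window
direct = x.iloc[-window:].corr(y.iloc[-window:])

print(rolling_corr.tail())
print("Direct check:", direct)
```

The first `window - 1` values are `NaN` because a full window is not yet available.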

> **Correlation:** The correlation coefficient, which indicates the strength of the linear relationship between two variables `x` and `y`, is given by:
>
> $r_{xy}= \frac{\sum_i (x_i-\mu_x)(y_i-\mu_y)}{\sqrt{\sum_i (x_i-\mu_x)^2 \sum_i(y_i-\mu_y)^2}}$
>
> where
> - $r_{xy}$ – the correlation coefficient of the linear relationship between the variables x and y
> - $x_i$ – the values of the x-variable in a sample
> - $\mu_x$ – the mean of the values of the x-variable
> - $y_i$ – the values of the y-variable in a sample
> - $\mu_y$ – the mean of the values of the y-variable.

5. **Moving average convergence divergence (MACD):** 

    Moving Average Convergence Divergence (MACD, or MAC-D) is a technical indicator used in finance to analyze stock price trends and momentum. It is a trend-following momentum indicator that shows the relationship between two exponential moving averages (EMAs) of a security's price. The MACD line is calculated by subtracting the 26-period EMA from the 12-period EMA.

    A nine-day EMA of the MACD line, called the signal line, is plotted on top of the MACD line and can function as a trigger for buy or sell signals. Traders may buy the security when the MACD line crosses above the signal line and sell, or short, the security when the MACD line crosses below it. MACD indicators can be interpreted in several ways, but the more common methods are crossovers, divergences, and rapid rises/falls.

    The formula to calculate the Moving Average Convergence Divergence (MACD) is as follows:

    - MACD Line = 12-day Exponential Moving Average (EMA) of Closing Prices - 26-day EMA of Closing Prices
    - Signal Line = 9-day EMA of MACD Line
    - MACD Histogram = MACD Line - Signal Line

    In this formula, the MACD Line is calculated by subtracting the 26-day EMA of the closing prices from the 12-day EMA of the closing prices. The Signal Line is then calculated by taking the 9-day EMA of the MACD Line. Finally, the MACD Histogram is calculated by subtracting the Signal Line from the MACD Line.

    **Key features:**
    1. MACD is best used with daily periods, where the traditional settings of 26/12/9 days are the norm.
    2. MACD triggers technical signals when the MACD line crosses above the signal line (to buy) or falls below it (to sell).
    3. MACD can help gauge whether a security is overbought or oversold, alerting traders to the strength of a directional move, and warning of a potential price reversal.
    4. MACD can also alert investors to bullish/bearish divergences (e.g., when a new high in price is not confirmed by a new high in MACD, and vice versa), suggesting a potential failure and reversal.
    5. After a signal line crossover, it is recommended to wait for three or four days to confirm that it is not a false move.
    
     <img src="MACD.png" width=750 height=400 />
    
    (Blue: MACD, Red: Signal)

    MACD is often displayed with a histogram (see the chart above) that graphs the distance between MACD and its signal line. If MACD is above the signal line, the histogram will be above the MACD’s baseline, or zero line. If MACD is below its signal line, the histogram will be below the MACD’s baseline. Traders use the MACD’s histogram to identify when bullish or bearish momentum is high—and possibly overbought/oversold.

| Relative Strength Index (RSI) | MACD |
|-------------------------------|------|
| The relative strength index (RSI) aims to signal whether a market is overbought or oversold in relation to recent price levels. The RSI is an oscillator that calculates average price gains and losses over a given period of time. The default is 14 periods, with values bounded from 0 to 100. A reading above 70 suggests an overbought condition and may signal that a top is forming, while a reading below 30 is considered oversold and may signal that a bottom is forming. | The MACD lines do not have concrete overbought/oversold levels like the RSI and other oscillator studies. Rather, they function on a relative basis: an investor or trader should focus on the level and direction of the MACD and signal lines compared with preceding price movements in the security at hand. |
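The RSI described in the table can be sketched as follows. This version uses simple rolling averages of gains and losses, which is an assumption made for brevity (other variants smooth the averages differently), and the helper name `rsi` and the price series are illustrative only.

```python
import pandas as pd

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """RSI using simple rolling averages of gains and losses (an assumption)."""
    delta = close.diff()
    gains = delta.clip(lower=0)        # positive changes, zero elsewhere
    losses = (-delta).clip(lower=0)    # magnitude of negative changes
    avg_gain = gains.rolling(period).mean()
    avg_loss = losses.rolling(period).mean()
    rs = avg_gain / avg_loss           # relative strength
    return 100 - 100 / (1 + rs)

# Made-up prices that rise steadily and then fall steadily
close = pd.Series([float(p) for p in list(range(100, 120)) + list(range(120, 100, -1))])
values = rsi(close)
print(values.iloc[[19, 39]])
```

After 14 consecutive gains the RSI sits at the top of its range, and after 14 consecutive losses at the bottom, which matches the overbought/oversold reading of the 70 and 30 thresholds.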

> **NOTE:**
> In the stock market, the terms "bullish" and "bearish" are used to describe the general sentiment or outlook of investors towards a particular stock, market, or the overall economy.

> 1.  A bullish market or stock means that investors are optimistic and confident about its prospects and believe that the stock or market is likely to rise in value. In a bullish market, investors may be buying more stocks, and there is generally more demand than supply for shares, which drives the prices higher.

> 2. A bearish market or stock means that investors are pessimistic and uncertain about its prospects and believe that the stock or market is likely to fall in value. In a bearish market, investors may be selling more stocks, and there is generally more supply than demand for shares, which drives the prices lower.

> There are several factors that can cause a market or stock to become bullish or bearish, including economic indicators such as interest rates, inflation, and GDP growth, as well as company-specific factors such as earnings reports, management changes, and industry trends.

> It's important to note that a bullish or bearish sentiment is not a guarantee of future performance, and stock prices can be affected by a wide range of factors. As a result, investors should conduct their own research and analysis before making any investment decisions.

**Python code snippet:**

```
import pandas as pd
# Load historical price data into a DataFrame
data = pd.read_csv('stock_data.csv')
# Calculate the 12-day and 26-day exponential moving averages (EMAs).
# adjust=False applies the recursive EMA formula (today's value weighted
# against yesterday's EMA) rather than pandas' default adjusted weights.
ema12 = data['Close'].ewm(span=12, adjust=False).mean()
ema26 = data['Close'].ewm(span=26, adjust=False).mean()
# Calculate the MACD line
macd_line = ema12 - ema26
# Calculate the 9-day EMA of the MACD line (signal line)
signal_line = macd_line.ewm(span=9, adjust=False).mean()
# Calculate the MACD histogram
macd_histogram = macd_line - signal_line
# Combine the MACD values into a single DataFrame
macd_dataframe = pd.DataFrame({'MACD Line': macd_line, 'Signal Line': signal_line, 'MACD Histogram': macd_histogram})
# Print the last 10 rows of the DataFrame
print(macd_dataframe.tail(10))
```

This code assumes that you have loaded historical price data into a CSV file named stock_data.csv, with columns for 'Date', 'Open', 'High', 'Low', and 'Close'.

The code first calculates the 12-day and 26-day exponential moving averages (EMAs) of the closing prices using the `ewm` method of the 'Close' column (a pandas Series).

The code then calculates the MACD line by subtracting the 26-day EMA from the 12-day EMA.

Next, the code calculates the 9-day EMA of the MACD line to generate the signal line.

Finally, the code calculates the MACD histogram by subtracting the signal line from the MACD line.

The code combines the MACD line, signal line, and MACD histogram values into a single DataFrame named macd_dataframe and prints the last 10 rows of the DataFrame to verify that the calculation is correct.

You can adjust the parameters, such as the periods of the EMAs or the signal line, to suit your needs.
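The crossover rule described earlier (a buy signal when the MACD line crosses above the signal line, a sell signal when it crosses below) can be sketched as follows. The helper name `macd_crossovers` and the sine-wave prices are illustrative assumptions, and `adjust=False` is chosen so that the EMAs follow the recursive definition.

```python
import numpy as np
import pandas as pd

def macd_crossovers(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """Return the MACD line, signal line and crossover flags (illustrative helper)."""
    macd = close.ewm(span=fast, adjust=False).mean() - close.ewm(span=slow, adjust=False).mean()
    sig = macd.ewm(span=signal, adjust=False).mean()
    above = macd > sig
    # Was the MACD line above the signal line on the previous row?
    prev_above = above.shift(1, fill_value=False).astype(bool)
    return pd.DataFrame({
        "MACD": macd,
        "Signal": sig,
        "Bullish": above & ~prev_above,   # crossed above the signal line
        "Bearish": ~above & prev_above,   # crossed below the signal line
    })

# Made-up oscillating prices so that several crossovers occur
t = np.linspace(0, 4 * np.pi, 200)
close = pd.Series(100 + 10 * np.sin(t))
result = macd_crossovers(close)
print(result[result["Bullish"] | result["Bearish"]])
```

In line with the key features listed above, a flagged row is only a candidate signal; waiting a few periods for confirmation is left to the reader.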

Reference: [Moving average convergence divergence](https://www.investopedia.com/terms/m/macd.asp#:~:text=Moving%20average%20convergence%2Fdivergence)


**Note:** What is an Exponential Moving Average (EMA)?
An exponential moving average (EMA) is a type of moving average (MA) that places a greater weight and significance on the most recent data points. It is also referred to as the exponentially weighted moving average. An EMA reacts more strongly to recent price changes than a simple moving average (SMA), which applies an equal weight to all observations in the period.

Formula:

$\text{EMA}_{\rm today}= \left(\text{Value}_\text{today}\times \left(\frac{\text{Smoothing}}{1+\text{Days}}\right)\right) + \text{EMA}_{\rm yesterday}\times \left(1-\left(\frac{\text{Smoothing}}{1+\text{Days}}\right) \right)$

where

- EMA = exponential moving average
- Days = the number of periods in the average (for example, 12 or 26 for the MACD)
- Smoothing = the smoothing factor

While there are many possible choices for the smoothing factor, the most common choice is 2, which gives the most recent observation a weight of $2/(1+\text{Days})$. If the smoothing factor is increased, more recent observations have more influence on the EMA.
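To make the recursion concrete, the sketch below implements the formula directly and compares it with pandas. Seeding the EMA with the first observation is an assumption (other seeds, such as an initial SMA, are also used), and the prices are made-up values.

```python
import pandas as pd

def ema_recursive(values, days, smoothing=2.0):
    """EMA via the recursion above, seeded with the first observation (an assumption)."""
    alpha = smoothing / (1 + days)
    ema = [values[0]]
    for value in values[1:]:
        ema.append(value * alpha + ema[-1] * (1 - alpha))
    return ema

# Made-up closing prices
prices = [22.27, 22.19, 22.08, 22.17, 22.18, 22.13, 22.23, 22.43, 22.24, 22.29]
manual = ema_recursive(prices, days=5)

# With smoothing = 2 the weight is 2 / (1 + days), which pandas exposes as `span`;
# adjust=False selects the same recursive form
pandas_ema = pd.Series(prices).ewm(span=5, adjust=False).mean()

max_diff = max(abs(m - p) for m, p in zip(manual, pandas_ema))
print("Largest difference from pandas:", max_diff)
```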

6. **Bollinger Bands:** 

    Bollinger Bands are a kind of trading envelope. They are lines plotted at an interval around a moving average. Bollinger Bands consist of a moving average and two standard deviations charted as one line above and one line below the moving average. The line above is two standard deviations added to the moving average. The line below is two standard deviations subtracted from the moving average. Traders generally use them to determine overbought and oversold zones, to confirm divergences between prices and indicators, and to project price targets. The wider the bands are, the greater the volatility is. The narrower the bands are, the lesser the volatility is. The moving average is calculated on the close.

    > **PARAMETERS:** Period & Standard deviation

    - Upper Band= moving average + $2\sigma$
    - Middle Band =  moving average
    - Lower Band = moving average - $2\sigma$

    We can use pandas to calculate Bollinger Bands, another popular technical indicator used in trading.
    
    **Python code snippet:**
    ```
    import pandas as pd
    # Load historical price data into a DataFrame
    data = pd.read_csv('stock_data.csv')
    # Calculate the 20-day simple moving average (middle band)
    sma20 = data['Close'].rolling(window=20).mean()
    # Calculate the 20-day rolling standard deviation of the closing prices
    std_dev = data['Close'].rolling(window=20).std()
    # Calculate the upper and lower Bollinger Bands (2 standard deviations)
    upper_band = sma20 + 2 * std_dev
    lower_band = sma20 - 2 * std_dev
    # Combine the Bollinger Bands values into a single DataFrame
    bbands_dataframe = pd.DataFrame({'Upper Band': upper_band, 'Middle Band': sma20, 'Lower Band': lower_band})
    # Print the last 10 rows of the DataFrame
    print(bbands_dataframe.tail(10))
    ```
    This code assumes that you have loaded historical price data into a CSV file named stock_data.csv, with columns for 'Date', 'Open', 'High', 'Low', and 'Close'.

7. **Commodity Channel Index (CCI):**
    The Commodity Channel Index, CCI, is designed to detect beginning and ending market trends. The computational procedure standardizes market prices much like a standard score in statistics. The final index attempts to measure the deviation from normal or major changes in the market's trend.

    According to the original author, 70% to 80% of all price fluctuations fall within +100 and -100 as measured by the index. A thorough discussion of the Commodity Channel Index can be found in the October 1980 edition of Commodities magazine (now Futures).

    The trading rules for the CCI are as follows. Establish a long position when the CCI exceeds +100. Liquidate when the index drops below +100. For a short position, you use the -100 value as your reference point. Any value less than -100, e.g. -125, suggests a short position, while a rise to -85 tells you to liquidate your short position.
    
    > **PARAMETERS:** Period (20) - the number of bars, or period, used to calculate the study.

    The calculation of the CCI requires several steps, listed in sequence below. First, the typical price for each interval is calculated as the simple arithmetic average of its high, low, and close prices.

    **Step-1:** First we need to calculate the typical price:

    `TP_t = (High_t + Low_t + Close_t) / 3`

    - `TP_t` is the typical price for this interval
    - `High_t` is the highest price for this interval
    - `Low_t` is the lowest price for this interval
    - `Close_t` is the closing price for this interval

    **Step-2:** Next we need to find the moving average of the typical price:

    `TPAVG_t = (TP_1 + TP_2 + ... + TP_n) / n`

    - `TPAVG_t` is the moving average of the typical price
    - `TP_n` is the typical price for the nth interval
    - `n` is the number of intervals for the average

    **Step-3:** Next we find the mean deviation:

    `MD_t = (|TP_1 - TPAVG_1| + ... + |TP_n - TPAVG_n|) / n`

    - `MD_t` is the mean deviation for this interval
    - `TP_n` is the typical price for the nth interval
    - `TPAVG_n` is the moving average of the typical price for the nth interval
    - `n` is the number of intervals

    The symbol `| |` designates the absolute value, so negative differences are treated as positive values.

    **Step-4:** Finally, the CCI value is computed as:

    `CCI_t = (TP_t - TPAVG_t) / (0.015 * MD_t)`

    - `CCI_t` is the commodity channel index for the current period
    - `TP_t` is the typical price for the current period
    - `TPAVG_t` is the moving average of the typical price
    - 0.015 is a constant
    - `MD_t` is the mean deviation for the period

    **Python code snippet:** Python code to calculate the CCI for a dataset stored in 'stock_data.csv' is as follows:

    ```
    import pandas as pd
    # Load historical price data into a DataFrame
    data = pd.read_csv('stock_data.csv')
    # Calculate typical price
    typical_price = (data['High'] + data['Low'] + data['Close']) / 3
    # Calculate the 20-day simple moving average of typical price
    sma20 = typical_price.rolling(window=20).mean()
    # Calculate the mean deviation
    mean_deviation = abs(typical_price - sma20).rolling(window=20).mean()
    # Calculate the CCI
    cci = (typical_price - sma20) / (0.015 * mean_deviation)
    # Combine the CCI values into a single DataFrame
    cci_dataframe = pd.DataFrame({'CCI': cci})
    # Print the last 10 rows of the DataFrame
    print(cci_dataframe.tail(10))
    ```
    This code assumes that you have loaded historical price data into a CSV file named stock_data.csv, with columns for 'Date', 'Open', 'High', 'Low', and 'Close'.

8. **Momentum (MOM):**

    Momentum is a technical analysis tool used by traders and investors to measure the rate of change in the price of a financial asset. It helps to identify the strength and direction of a trend by comparing the current price to the price from a specified number of periods ago.

    Momentum monitors the change in prices. It tells you whether prices are increasing at an increasing rate or decreasing at a decreasing rate. 

    - Is the market trend about to change? 
    - Is the market overbought or oversold? 

    Momentum may help us to find those market conditions.

    **Formula:** Momentum is calculated using a simple formula:

    `Momentum = Current Price - Price n periods ago`

    Where `n` represents the number of periods, which could be days, weeks, or months, depending on the timeframe being analyzed.

    Momentum can be calculated by computing the continuous difference between prices at fixed intervals. That difference is either a positive or negative value. 
    - When momentum is above the zero line and rising, prices are increasing at an increasing rate. 
    - If momentum is above the zero line but is declining, prices are still increasing but at a decreasing rate.

    The opposite is true when momentum falls below the zero line. 
    - If momentum is falling and is below the zero line, prices are decreasing at an increasing rate. 
    - With momentum below the zero line and rising, prices are still declining but at a decreasing rate.

    Traders use momentum to identify potential trend reversals, as well as to confirm the strength of an existing trend. When momentum is positive and rising, it suggests that the price is likely to continue to increase. Conversely, when momentum is negative and falling, it suggests that the price is likely to continue to decrease.

    Momentum can also be used in conjunction with other technical indicators, such as moving averages, to generate buy and sell signals. For example, a trader may look for a crossover of the moving average and momentum lines as a signal to enter or exit a position.

    It's important to note that momentum is just one tool among many used in technical analysis and should not be relied upon solely to make trading decisions. It's essential to consider other factors, such as fundamental analysis, market conditions, and risk management, before making any investment decisions.

    **Python code snippet:**
    ```
    import pandas as pd
    # Load historical price data into a DataFrame
    data = pd.read_csv('stock_data.csv')
    # Calculate the price difference over the past 10 periods
    mom = data['Close'] - data['Close'].shift(10)
    # Combine the MOM values into a single DataFrame
    momentum = pd.DataFrame({'MOM': mom})
    # Print the last 10 rows of the DataFrame
    print(momentum.tail(10))
    ```

    This code assumes that you have loaded historical price data into a CSV file named stock_data.csv, with columns for 'Date', 'Open', 'High', 'Low', and 'Close'.
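The four cases described above (momentum above or below the zero line, and rising or falling) can be labeled automatically. This is a minimal sketch: the helper name `momentum_regime`, the label strings, and the price series are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def momentum_regime(close: pd.Series, n: int = 10) -> pd.DataFrame:
    """Label each period with one of the four momentum cases (illustrative helper)."""
    mom = close - close.shift(n)   # Momentum = current price - price n periods ago
    change = mom.diff()            # is momentum itself rising or falling?
    conditions = [
        (mom > 0) & (change > 0),
        (mom > 0) & (change < 0),
        (mom < 0) & (change < 0),
        (mom < 0) & (change > 0),
    ]
    labels = [
        "increasing at an increasing rate",
        "increasing at a decreasing rate",
        "decreasing at an increasing rate",
        "decreasing at a decreasing rate",
    ]
    # Rows without enough history (or with no change) fall through to the default
    regime = np.select(conditions, labels, default="undetermined")
    return pd.DataFrame({"MOM": mom, "Change": change, "Regime": regime}, index=close.index)

# Made-up prices that accelerate upward: 0, 1, 4, 9, ...
close = pd.Series([float(i ** 2) for i in range(8)])
result = momentum_regime(close, n=2)
print(result)
```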

9. **Stochastic (STO):**

    Stochastic (STO) is a popular technical indicator used by traders to identify potential trend reversals and overbought or oversold conditions in the market. It measures the relationship between a security's closing price and its price range over a specified period, typically 14 days.

    The Stochastic oscillator consists of two lines: 
    - the `%K` line and 
    - the `%D` line. 
    
    The `%K` line represents the current price's position relative to the highest and lowest prices over a specified period, while the `%D` line is a moving average of the `%K` line.

    The mathematical formulas to calculate the Stochastic oscillator are as follows:

    $\%K = \frac{C- L14}{H14-L14}\times 100$

    where

    - C = The most recent closing price
    - L14 = The lowest price traded of the 14 previous trading sessions
    - H14 = The highest price traded during the same 14-day period
    - %K = The current value of the stochastic indicator

    
    $\%D = 3~\text{-day SMA of}~ \%K$

    Where:
    - `3-day SMA`: the simple moving average of the %K line over the past three periods
    
    Notably, %K is referred to sometimes as the fast stochastic indicator. The "slow" stochastic indicator is taken as %D = 3-period moving average of %K. 

    >  **3-day SMA:** A 3-day simple moving average is calculated by adding up the values of a series over the past three trading days and dividing the sum by three. Here it is applied to the `%K` values rather than to the closing prices.
    
    The resulting values for `%K` and `%D` will range from 0 to 100, with higher values indicating overbought conditions and lower values indicating oversold conditions. Typically, when the %K line crosses above the `%D` line, it is considered a bullish signal, while a bearish signal is generated when the `%K` line crosses below the `%D` line.

    Traders may also use the Stochastic oscillator in conjunction with other technical indicators, such as trendlines or moving averages, to confirm potential signals or filter out false ones.

    It's important to note that no single technical indicator can provide perfect trading signals, and traders should always consider multiple factors, including fundamental analysis, market conditions, and risk management, before making any trading decisions.

| RSI | STO |
|-----|-----|
| The RSI is a momentum oscillator that measures the magnitude of recent price changes to evaluate whether a stock is overbought or oversold. The RSI compares the average gains of the stock's price during a given period (usually 14 days) to the average losses during the same period. The RSI is expressed as a value between 0 and 100, and a reading above 70 is typically considered overbought, while a reading below 30 is typically considered oversold. | The STO, on the other hand, is a momentum indicator that compares a stock's closing price to its price range over a given period. The STO consists of two lines, %K and %D, which are both expressed as values between 0 and 100. The %K line measures the current price in relation to the highest and lowest prices over a given period, while the %D line is a moving average of the %K line. When the %K line crosses above the %D line, it is seen as a bullish signal, indicating that the stock may be about to rise, and vice versa. |

**Python code snippet:**

```
import pandas as pd
# Load historical price data into a DataFrame
data = pd.read_csv('stock_data.csv')
# Calculate the highest high and lowest low over the past 14 periods
high14 = data['High'].rolling(14).max()
low14 = data['Low'].rolling(14).min()
# Calculate the %K line
k = 100 * ((data['Close'] - low14) / (high14 - low14))
# Calculate the 3-day simple moving average of %K to get %D line
d = k.rolling(3).mean()
# Combine %K and %D into a single DataFrame
stochastic = pd.DataFrame({'%K': k, '%D': d})
# Print the last 10 rows of the DataFrame
print(stochastic.tail(10))
```

This code assumes that you have loaded historical price data into a CSV file named stock_data.csv, with columns for 'Date', 'Open', 'High', 'Low', and 'Close'.

10. **Correlation:** Correlation is a statistical measure that describes the degree of linear relationship between two variables. It is a measure of the strength and direction of the relationship between two variables.

    The formula to calculate the correlation coefficient (r) between two variables X and Y is as follows:

    $r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}}$

    where:

    - `ΣXY` is the sum of the products of the corresponding values of `X` and `Y`
    - `ΣX` and `ΣY` are the sums of the values of `X` and `Y`, respectively
    - `ΣX^2` and `ΣY^2` are the sums of the squares of the values of `X` and `Y`, respectively
    - `n` is the number of data points

    The correlation coefficient `r` is a value between `-1` and `1`, where 
    - `-1` indicates a perfect negative correlation (i.e., as X increases, Y decreases), 
    - `0` indicates no correlation, and 
    - `1` indicates a perfect positive correlation (i.e., as X increases, Y also increases).

    It is important to note that correlation does not necessarily imply causation - just because two variables are correlated does not mean that one causes the other. Correlation is simply a measure of the strength of the relationship between the two variables.

    **Python code to calculate correlation:**

    ```
    import pandas as pd
    # Load data into a DataFrame
    data = pd.read_csv('data.csv')
    # Calculate the correlation between two columns
    corr = data['Column1'].corr(data['Column2'])
    # Print the correlation coefficient
    print('Correlation coefficient:', corr)
    ```

    In this code, replace 'Column1' and 'Column2' with the names of the columns for which you want to calculate the correlation coefficient. The corr() function calculates the Pearson correlation coefficient, which is a measure of the linear relationship between two variables. It returns a value between -1 and 1, where -1 indicates a perfectly negative correlation, 0 indicates no correlation, and 1 indicates a perfectly positive correlation.

    **Correlation Matrix:**

    We can also calculate the correlation matrix for a DataFrame by calling the corr() method on the DataFrame. This will return a matrix of all pairwise correlations between the columns in the DataFrame. Here's an example:

    ```
    import pandas as pd
    # Load data into a DataFrame
    data = pd.read_csv('data.csv')
    # Calculate the correlation matrix
    corr_matrix = data.corr()
    # Print the correlation matrix
    print(corr_matrix)
    ```

    This will print a matrix where each entry is the correlation coefficient between two columns in the DataFrame. The diagonal of the matrix will always be 1, since each column is perfectly correlated with itself.
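As a sanity check, the computational formula for `r` given earlier in this item can be coded directly and compared with `Series.corr()`. The helper name `pearson_r` and the sample values are illustrative.

```python
import numpy as np
import pandas as pd

def pearson_r(x, y):
    """Pearson r from raw sums, following the formula for r given earlier."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    numerator = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    denominator = np.sqrt(
        (n * np.sum(x ** 2) - np.sum(x) ** 2) * (n * np.sum(y ** 2) - np.sum(y) ** 2)
    )
    return numerator / denominator

# Made-up sample values
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_manual = pearson_r(x, y)
r_pandas = pd.Series(x).corr(pd.Series(y))
print(r_manual, r_pandas)
```

For these values the sums are `ΣX = 15`, `ΣY = 20`, `ΣXY = 66`, `ΣX^2 = 55`, and `ΣY^2 = 86`, so `r = 30 / sqrt(50 × 30) ≈ 0.775`.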

11. **Volatility:** In finance, volatility refers to the degree of variation of a financial instrument's price over time. It is often measured as the standard deviation of the instrument's returns over a specific time period. The formula for calculating volatility is as follows:

$\text{volatility} = \sqrt{\frac{\sum{(R_i - R_{\rm avg})^2}}{(n-1)}}$

Where:

- $R_i$ is the return for a given day or time period
- $R_{\rm avg}$ is the average return over the same period
- $n$ is the number of days or time periods being analyzed

The formula can be simplified as follows:

`volatility = sqrt(variance)`

Where:

- variance is the sample variance of the returns, i.e. the sum of the squared differences between each period's return and the average return over the same period, divided by $n-1$.

In practice, the volatility of a financial instrument is usually calculated using historical data, and may be annualized or adjusted for other factors such as risk-free rates or dividends.
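A minimal sketch of the volatility calculation on a made-up price series. The annualization factor of 252 trading days is a common convention and an assumption here, not something fixed by the formula above.

```python
import numpy as np
import pandas as pd

# Made-up closing prices
close = pd.Series([100.0, 101.0, 99.5, 102.0, 103.5, 102.5, 104.0])

# Daily returns R_i
returns = close.pct_change().dropna()

# Sample standard deviation of returns (pandas uses n - 1 in the denominator by default)
daily_vol = returns.std()

# The same value computed directly from the formula
manual_vol = np.sqrt(((returns - returns.mean()) ** 2).sum() / (len(returns) - 1))

# Annualize, assuming 252 trading days per year
annual_vol = daily_vol * np.sqrt(252)

print("Daily volatility:", daily_vol)
print("Annualized volatility:", annual_vol)
```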

### Multiple datasets and comparative study
If you have data for three more companies with the same data structure as the Deutsche Bank share data, there are many things you can do with pandas and other Python libraries. Here are some ideas:

1. **Merge the data:** You can use pandas to merge the data for all four companies into a single DataFrame, allowing you to analyze and compare the data more easily. For example, to merge the Deutsche Bank, Apple, Microsoft, and Amazon data into a single DataFrame:

    ```
    import pandas as pd
    # read in the data for all four companies
    db = pd.read_csv('deutsche_bank.csv')
    apple = pd.read_csv('apple.csv')
    microsoft = pd.read_csv('microsoft.csv')
    amazon = pd.read_csv('amazon.csv')
    # tag every column except 'Date' with a company suffix so that
    # names such as 'Close' stay unique after merging
    def tag(df, suffix):
        return df.rename(columns=lambda c: c if c == 'Date' else c + suffix)
    # merge the data for all four companies on the 'Date' column
    merged_data = (
        tag(db, '_db')
        .merge(tag(apple, '_apple'), on='Date')
        .merge(tag(microsoft, '_msft'), on='Date')
        .merge(tag(amazon, '_amzn'), on='Date')
    )
    ```
2. **Visualize the data:** You can use matplotlib or seaborn to create visualizations of the data for each company and compare them. For example, to plot the closing price of each company over time:

    ```
    # plot the closing price for each company over time
    plt.plot(db['Date'], db['Close'], label='Deutsche Bank')
    plt.plot(apple['Date'], apple['Close'], label='Apple')
    plt.plot(microsoft['Date'], microsoft['Close'], label='Microsoft')
    plt.plot(amazon['Date'], amazon['Close'], label='Amazon')
    plt.title('Closing Price Over Time')
    plt.xlabel('Date')
    plt.ylabel('Closing Price')
    plt.legend()
    plt.show()
    ```

3. **Calculate summary statistics:** You can use pandas to calculate summary statistics for each company, such as the mean, standard deviation, and correlation coefficient. For example, to calculate the mean closing price and volume for each company:

    ```
    # calculate mean closing price and volume for each company
    mean_close = merged_data[['Close_db', 'Close_apple', 'Close_msft', 'Close_amzn']].mean()
    mean_volume = merged_data[['Volume_db', 'Volume_apple', 'Volume_msft', 'Volume_amzn']].mean()

    print('Mean closing price:')
    print(mean_close)
    print('Mean volume:')
    print(mean_volume)
    ```