# Chapter 1: Pandas Foundations

## Recipes
* [Dissecting the anatomy of a DataFrame](#Dissecting-the-anatomy-of-a-DataFrame)
* [Accessing the main DataFrame components](#Accessing-the-main-DataFrame-components)
* [Understanding data types](#Understanding-data-types)
* [Selecting a single column of data as a Series](#Selecting-a-single-column-of-data-as-a-Series)
* [Calling Series methods](#Calling-Series-methods)
* [Working with operators on a Series](#Working-with-operators-on-a-Series)
* [Chaining Series methods together](#Chaining-Series-methods-together)
* [Making the index meaningful](#Making-the-index-meaningful)
* [Renaming row and column names](#Renaming-row-and-column-names)
* [Creating and deleting columns](#Creating-and-deleting-columns)

In [157]:
import pandas as pd
import numpy as np

# Dissecting the anatomy of a DataFrame

#### Change options to get specific output for book

In [158]:
# pd.set_option('max_columns', 8, 'max_rows', 10)
pd.set_option('display.max_columns', 8)
pd.set_option('display.max_rows', 21)

In [159]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')
signals.head()

Signal,Rating,Signals,Price,Growth,...,Maximal consecutive loss orders,Maximal balance drawdown pct,Relative drawdown by balance amount,Relative drawdown by equity amount
1371953,1,Activity,30,3,...,8,11.87,2505.44,3553.24
2020078,10,EURSpecial,30,823,...,4,6.09,53.3,342.76
2105300,6,Arnold,30,1,...,9,16.19,802.9,329.8
2049959,9,MAX NoLimit,50,692,...,7,5.81,753.57,2483.64
2145152,12,PabloFX Safe,30,677,...,1,2.93,14.1,500.97


![dataframe anatomy](../images/ch01_dataframe_anatomy.png)

# Accessing the main DataFrame components

In [160]:
columns = signals.columns
index = signals.index
data = signals.values

In [161]:
columns

Index(['Rating', 'Signals', 'Price', 'Growth', 'Subscribers', 'Funds',
       'Balance', 'Weeks', 'Drawdown', 'Trades', 'Profit Trades',
       'Loss Trades', 'Best trade', 'Worst trade', 'Gross Profit',
       'Gross Loss', 'Maximum consecutive wins', 'Maximal consecutive profit',
       'Sharpe Ratio', 'Trading activity', 'Max deposit load', 'Latest trade',
       'Trades per week', 'Avg holding time', 'Recovery Factor', 'Long Trades',
       'Short Trades', 'Profit Factor', 'Expected Payoff', 'Average Profit',
       'Average Loss', 'Maximum consecutive losses',
       'Maximal consecutive loss', 'Monthly growth', 'Annual Forecast',
       'Algo trading', 'Absolute balance drawdown', 'Maximal balance drawdown',
       'Relative drawdown by balance', 'Relative drawdown by equity', 'Months',
       'Average By Month', 'Profit Trades Pct', 'Loss Trades Pct',
       'Gross Profit Pct', 'Gross Loss Pips',
       'Maximum consecutive wins amount', 'Maximal consecutive profit orders',
    

In [162]:
index

Index([1371953, 2020078, 2105300, 2049959, 2145152, 1463352, 2084523,  597211,
       2195366, 2115303,
       ...
       2192054, 1722005, 1808747, 2201189, 2216572, 2211292, 2207077, 1566301,
       2158945, 1614162],
      dtype='int64', name='Signal', length=4520)

In [163]:
data

array([['1', 'Activity', 30, ..., 11.87, 2505.44, 3553.24],
       ['10', 'EURSpecial', 30, ..., 6.09, 53.3, 342.76],
       ['6', 'Arnold', 30, ..., 16.19, 802.9, 329.8],
       ...,
       ['5688', 'High Risk', 30, ..., 39.86, 1539.94, 8806.67],
       ['5693', '2', 30, ..., 92.19, 1081.51, 210.08],
       ['5720', '345', 30, ..., 59.0, 15659.99, 4213.67]], dtype=object)

In [164]:
type(index)

pandas.core.indexes.base.Index

In [165]:
type(columns)

pandas.core.indexes.base.Index

In [166]:
type(data)

numpy.ndarray

In [167]:
issubclass(pd.RangeIndex, pd.Index)

True
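
The three components can also be inspected on a small stand-in frame. The sketch below is illustrative only: the name `toy` and its three rows are assumptions (the values are copied from the `head()` output above), and `to_numpy()` is used as the currently recommended counterpart of `.values`.

```python
import pandas as pd

# Hypothetical three-row stand-in for the signals table
toy = pd.DataFrame(
    {"Price": [30, 30, 50], "Growth": [3, 823, 692]},
    index=pd.Index([1371953, 2020078, 2049959], name="Signal"),
)

index = toy.index        # row labels (an Index object)
columns = toy.columns    # column labels (also an Index object)
data = toy.to_numpy()    # underlying values as a NumPy ndarray

print(type(index).__name__, type(columns).__name__, type(data).__name__)
print(data.shape)
```

Both the row labels and the column labels are `Index` objects, which is why the two `type` calls above report the same class.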

## There's more

In [168]:
index.values

array([1371953, 2020078, 2105300, ..., 1566301, 2158945, 1614162])

In [169]:
columns.values

array(['Rating', 'Signals', 'Price', 'Growth', 'Subscribers', 'Funds',
       'Balance', 'Weeks', 'Drawdown', 'Trades', 'Profit Trades',
       'Loss Trades', 'Best trade', 'Worst trade', 'Gross Profit',
       'Gross Loss', 'Maximum consecutive wins',
       'Maximal consecutive profit', 'Sharpe Ratio', 'Trading activity',
       'Max deposit load', 'Latest trade', 'Trades per week',
       'Avg holding time', 'Recovery Factor', 'Long Trades',
       'Short Trades', 'Profit Factor', 'Expected Payoff',
       'Average Profit', 'Average Loss', 'Maximum consecutive losses',
       'Maximal consecutive loss', 'Monthly growth', 'Annual Forecast',
       'Algo trading', 'Absolute balance drawdown',
       'Maximal balance drawdown', 'Relative drawdown by balance',
       'Relative drawdown by equity', 'Months', 'Average By Month',
       'Profit Trades Pct', 'Loss Trades Pct', 'Gross Profit Pct',
       'Gross Loss Pips', 'Maximum consecutive wins amount',
       'Maximal consecutive profit

# Understanding data types

In [170]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')

In [171]:
signals.dtypes

Rating                                  object
Signals                                 object
Price                                    int64
Growth                                   int64
Subscribers                              int64
                                        ...   
Maximum consecutive losses amount      float64
Maximal consecutive loss orders          int64
Maximal balance drawdown pct           float64
Relative drawdown by balance amount    float64
Relative drawdown by equity amount     float64
Length: 55, dtype: object

In [172]:
signals.value_counts()

Rating  Signals          Price  Growth  Subscribers  Funds  Balance  Weeks  Drawdown  Trades  Profit Trades  Loss Trades  Best trade  Worst trade  Gross Profit  Gross Loss  Maximum consecutive wins  Maximal consecutive profit  Sharpe Ratio  Trading activity  Max deposit load  Latest trade  Trades per week  Avg holding time  Recovery Factor  Long Trades  Short Trades  Profit Factor  Expected Payoff  Average Profit  Average Loss  Maximum consecutive losses  Maximal consecutive loss  Monthly growth  Annual Forecast  Algo trading  Absolute balance drawdown  Maximal balance drawdown  Relative drawdown by balance  Relative drawdown by equity  Months  Average By Month  Profit Trades Pct  Loss Trades Pct  Gross Profit Pct  Gross Loss Pips  Maximum consecutive wins amount  Maximal consecutive profit orders  Long Trades Pct  Short Trades Pct  Maximum consecutive losses amount  Maximal consecutive loss orders  Maximal balance drawdown pct  Relative drawdown by balance amount  Relative drawdown by
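
Calling `value_counts()` on the whole DataFrame counts unique rows, which is rarely what is wanted when exploring data types. A possible alternative, sketched here on a hypothetical four-column frame named `toy`, is to count how many columns share each dtype by calling `value_counts()` on the `dtypes` Series instead.

```python
import pandas as pd

# Hypothetical stand-in with one text, two integer and one float column
toy = pd.DataFrame({
    "Signals": ["Activity", "EURSpecial"],
    "Price": [30, 30],
    "Growth": [3, 823],
    "Maximal balance drawdown pct": [11.87, 6.09],
})

# dtypes is itself a Series, so value_counts() tallies columns per dtype
dtype_counts = toy.dtypes.value_counts()
print(dtype_counts)
```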

# Selecting a single column of data as a Series

![series anatomy](../images/ch01_series_anatomy.png)

In [173]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')

In [174]:
signals['Growth']

Signal
1371953      3
2020078    823
2105300      1
2049959    692
2145152    677
          ... 
2211292     65
2207077     70
1566301     65
2158945     66
1614162     70
Name: Growth, Length: 4520, dtype: int64

In [175]:
signals['Growth'].dtypes

dtype('int64')

In [176]:
type(signals['Growth'])

pandas.core.series.Series

## There's more

In [177]:
signal_growth = signals['Growth']
signal_growth.name

'Growth'

In [178]:
signal_growth.to_frame().head()

Signal,Growth
1371953,3
2020078,823
2105300,1
2049959,692
2145152,677


# Calling Series methods

## Getting ready...

In [179]:
s_attr_methods = set(dir(pd.Series))
len(s_attr_methods)

411

In [180]:
df_attr_methods = set(dir(pd.DataFrame))
len(df_attr_methods)

427

In [181]:
len(s_attr_methods & df_attr_methods)

357

## How to do it...

In [182]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')
signals.columns

Index(['Rating', 'Signals', 'Price', 'Growth', 'Subscribers', 'Funds',
       'Balance', 'Weeks', 'Drawdown', 'Trades', 'Profit Trades',
       'Loss Trades', 'Best trade', 'Worst trade', 'Gross Profit',
       'Gross Loss', 'Maximum consecutive wins', 'Maximal consecutive profit',
       'Sharpe Ratio', 'Trading activity', 'Max deposit load', 'Latest trade',
       'Trades per week', 'Avg holding time', 'Recovery Factor', 'Long Trades',
       'Short Trades', 'Profit Factor', 'Expected Payoff', 'Average Profit',
       'Average Loss', 'Maximum consecutive losses',
       'Maximal consecutive loss', 'Monthly growth', 'Annual Forecast',
       'Algo trading', 'Absolute balance drawdown', 'Maximal balance drawdown',
       'Relative drawdown by balance', 'Relative drawdown by equity', 'Months',
       'Average By Month', 'Profit Trades Pct', 'Loss Trades Pct',
       'Gross Profit Pct', 'Gross Loss Pips',
       'Maximum consecutive wins amount', 'Maximal consecutive profit orders',
    

In [183]:
avg_by_month = signals['Average By Month']
monthly_growth = signals['Monthly growth']

In [184]:
avg_by_month.head()

Signal
1371953    15.49
2020078    17.48
2105300    14.05
2049959     3.91
2145152    23.44
Name: Average By Month, dtype: float64

In [185]:
monthly_growth.head()

Signal
1371953    5.06
2020078    6.24
2105300    1.96
2049959    1.52
2145152    9.57
Name: Monthly growth, dtype: float64

In [186]:
# pd.set_option('max_rows', 8)
pd.set_option('display.max_rows', 10)
weeks = signals['Weeks']
weeks.value_counts()

Weeks
8      148
2      132
3      123
13     114
9      111
      ... 
145      1
419      1
268      1
438      1
460      1
Name: count, Length: 287, dtype: int64

In [187]:
months = signals['Months']
months.value_counts()

Months
3      463
4      454
2      450
1      392
6      287
      ... 
122      1
101      1
97       1
78       1
107      1
Name: count, Length: 91, dtype: int64

In [188]:
months.size

4520

In [189]:
months.shape

(4520,)

In [190]:
len(months)

4520

In [191]:
months.count()

4520

In [192]:
monthly_growth.count()

4515

In [193]:
monthly_growth.quantile() # 50th percentile by default

5.98

In [194]:
monthly_growth.quantile([0.13, 0.21, 0.34, 0.55, 0.89])

0.13     1.3100
0.21     2.1400
0.34     3.5576
0.55     7.1070
0.89    26.6746
Name: Monthly growth, dtype: float64

In [195]:
monthly_growth.min(), monthly_growth.max(), \
monthly_growth.mean(), monthly_growth.median(), \
monthly_growth.std(), monthly_growth.sum()

(0.0, 245.5, 11.440863787375415, 5.98, 16.074610083265693, 51655.5)

In [196]:
monthly_growth.describe()

count    4515.000000
mean       11.440864
std        16.074610
min         0.000000
25%         2.560000
50%         5.980000
75%        14.130000
max       245.500000
Name: Monthly growth, dtype: float64

In [197]:
avg_by_month.describe()

count    4520.000000
mean        6.925144
std        10.311640
min       -63.210000
25%         2.090000
50%         4.690000
75%         9.752500
max        89.020000
Name: Average By Month, dtype: float64

In [198]:
avg_by_month.quantile(.2)

1.55

In [199]:
temp = avg_by_month.dropna()    # returns a new Series; the original is left untouched
temp.sort_values(ascending=True).tail(13)

Signal
2220546    70.10
2183180    70.12
2187806    71.25
2203298    72.15
2193239    72.82
           ...  
2177787    81.63
2181709    81.70
2216464    85.77
2205624    88.78
2217753    89.02
Name: Average By Month, Length: 13, dtype: float64

In [200]:
# avg_by_month.quantile()
# avg_by_month.quantile(.2)
avg_by_month.quantile([.1, .2, .3, .4, .5, ])

0.1   -0.271
0.2    1.550
0.3    2.564
0.4    3.570
0.5    4.690
Name: Average By Month, dtype: float64

In [201]:
avg_by_month.quantile([.1, .2, .3, .4, .5, .6, .7, .8, .9])

0.1    -0.271
0.2     1.550
0.3     2.564
0.4     3.570
0.5     4.690
0.6     6.134
0.7     8.243
0.8    11.540
0.9    17.241
Name: Average By Month, dtype: float64

In [202]:
monthly_growth.isnull()

Signal
1371953    False
2020078    False
2105300    False
2049959    False
2145152    False
           ...  
2211292    False
2207077    False
1566301    False
2158945    False
1614162    False
Name: Monthly growth, Length: 4520, dtype: bool

In [203]:
monthly_growth.isnull().value_counts()

Monthly growth
False    4515
True        5
Name: count, dtype: int64

In [204]:
monthly_growth.value_counts()

Monthly growth
0.00     27
2.71     11
0.71     11
0.61     10
0.66     10
         ..
9.51      1
43.22     1
28.56     1
49.94     1
19.22     1
Name: count, Length: 2106, dtype: int64

In [205]:
signals.info()

<class 'pandas.core.frame.DataFrame'>
Index: 4520 entries, 1371953 to 1614162
Data columns (total 55 columns):
 #   Column                               Non-Null Count  Dtype  
---  ------                               --------------  -----  
 0   Rating                               4520 non-null   object 
 1   Signals                              4520 non-null   object 
 2   Price                                4520 non-null   int64  
 3   Growth                               4520 non-null   int64  
 4   Subscribers                          4520 non-null   int64  
 5   Funds                                4520 non-null   int64  
 6   Balance                              4520 non-null   int64  
 7   Weeks                                4520 non-null   int64  
 8   Drawdown                             4520 non-null   int64  
 9   Trades                               4520 non-null   int64  
 10  Profit Trades                        4520 non-null   int64  
 11  Loss Trades               

In [206]:
signals.columns[signals.isna().any()].tolist()

['Trading activity', 'Profit Factor', 'Monthly growth', 'Annual Forecast']

In [207]:
monthly_growth_filled = monthly_growth.fillna(0)
monthly_growth_filled.count()

4520

In [208]:
monthly_growth_dropped = monthly_growth.dropna()
monthly_growth_dropped.size

4515
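
The relationship between `size`, `count`, `fillna` and `dropna` can be checked on a tiny Series. This is a minimal sketch: the Series `s` and its values are made up for illustration and are not taken from the signals data.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with two missing values
s = pd.Series([5.06, np.nan, 1.96, np.nan, 9.57], name="Monthly growth")

n_total = s.size            # counts every element, missing or not
n_present = s.count()       # counts only non-missing elements
n_missing = s.isna().sum()  # True counts as 1 when summed

filled = s.fillna(0)        # new Series with NaN replaced by 0
dropped = s.dropna()        # new Series with NaN rows removed

print(n_total, n_present, n_missing, filled.count(), dropped.size)
```

Neither `fillna` nor `dropna` modifies `s` here; each returns a new Series.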

## There's more...

In [209]:
weeks

Signal
1371953    117
2020078     62
2105300     79
2049959    238
2145152     39
          ... 
2211292     48
2207077    460
1566301    107
2158945     46
1614162    134
Name: Weeks, Length: 4520, dtype: int64

In [210]:
weeks.value_counts(normalize=True)

Weeks
8      0.032743
2      0.029204
3      0.027212
13     0.025221
9      0.024558
         ...   
145    0.000221
419    0.000221
268    0.000221
438    0.000221
460    0.000221
Name: proportion, Length: 287, dtype: float64

In [211]:
annual_forecast = signals['Annual Forecast']

In [212]:
annual_forecast.hasnans

True

In [213]:
annual_forecast.notnull()

Signal
1371953    True
2020078    True
2105300    True
2049959    True
2145152    True
           ... 
2211292    True
2207077    True
1566301    True
2158945    True
1614162    True
Name: Annual Forecast, Length: 4520, dtype: bool

# Working with operators on a Series

In [214]:
pd.options.display.max_rows = 6

In [215]:
5 + 9    # plus operator example. Adds 5 and 9

14

In [216]:
4 ** 2   # exponentiation operator. Raises 4 to the second power

16

In [217]:
a = 10   # assignment operator.

In [218]:
5 <= 9   # less than or equal to operator

True

In [219]:
'abcde' + 'fg'    # plus operator for strings. Concatenates the two strings

'abcdefg'

In [220]:
not (5 <= 9)      # not is an operator that is a reserved keyword and reverses a boolean

False

In [221]:
7 in [1, 2, 6]    # in operator checks for membership in a list

False

In [222]:
set([1,2,3]) & set([2,3,4])

{2, 3}

In [223]:
# [1, 2, 3] - 3   # TypeError: unsupported operand type(s) for -: 'list' and 'int'

In [224]:
# a = set([1,2,3])     
# a[0]                 # the indexing operator does not work with sets | TypeError: 'set' object is not subscriptable

## Getting ready...

In [225]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')
signals.columns

Index(['Rating', 'Signals', 'Price', 'Growth', 'Subscribers', 'Funds',
       'Balance', 'Weeks', 'Drawdown', 'Trades', 'Profit Trades',
       'Loss Trades', 'Best trade', 'Worst trade', 'Gross Profit',
       'Gross Loss', 'Maximum consecutive wins', 'Maximal consecutive profit',
       'Sharpe Ratio', 'Trading activity', 'Max deposit load', 'Latest trade',
       'Trades per week', 'Avg holding time', 'Recovery Factor', 'Long Trades',
       'Short Trades', 'Profit Factor', 'Expected Payoff', 'Average Profit',
       'Average Loss', 'Maximum consecutive losses',
       'Maximal consecutive loss', 'Monthly growth', 'Annual Forecast',
       'Algo trading', 'Absolute balance drawdown', 'Maximal balance drawdown',
       'Relative drawdown by balance', 'Relative drawdown by equity', 'Months',
       'Average By Month', 'Profit Trades Pct', 'Loss Trades Pct',
       'Gross Profit Pct', 'Gross Loss Pips',
       'Maximum consecutive wins amount', 'Maximal consecutive profit orders',
    

In [226]:
EP = signals['Expected Payoff']
EP

Signal
1371953    9.07
2020078    0.60
2105300    3.50
           ... 
1566301    1.40
2158945    0.10
1614162    0.63
Name: Expected Payoff, Length: 4520, dtype: float64

In [227]:
EP + 1

Signal
1371953    10.07
2020078     1.60
2105300     4.50
           ...  
1566301     2.40
2158945     1.10
1614162     1.63
Name: Expected Payoff, Length: 4520, dtype: float64

In [228]:
EP * 2.5

Signal
1371953    22.675
2020078     1.500
2105300     8.750
            ...  
1566301     3.500
2158945     0.250
1614162     1.575
Name: Expected Payoff, Length: 4520, dtype: float64

In [229]:
EP // 7

Signal
1371953    1.0
2020078    0.0
2105300    0.0
          ... 
1566301    0.0
2158945    0.0
1614162    0.0
Name: Expected Payoff, Length: 4520, dtype: float64

In [230]:
EP > 7

Signal
1371953     True
2020078    False
2105300    False
           ...  
1566301    False
2158945    False
1614162    False
Name: Expected Payoff, Length: 4520, dtype: bool

In [231]:
PF = signals['Profit Factor']

In [232]:
PF

Signal
1371953    2.12
2020078    2.10
2105300    2.14
           ... 
1566301    1.35
2158945    1.02
1614162    1.15
Name: Profit Factor, Length: 4520, dtype: float64

In [233]:
PF == 2.10

Signal
1371953    False
2020078     True
2105300    False
           ...  
1566301    False
2158945    False
1614162    False
Name: Profit Factor, Length: 4520, dtype: bool
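
Exact equality on floating-point columns works when the stored value is exactly the literal being compared, but it can miss values produced by arithmetic. The sketch below uses a hypothetical Series `pf` and NumPy's `isclose` as one tolerance-based alternative; it is an illustration, not a replacement for the comparison above.

```python
import numpy as np
import pandas as pd

# Hypothetical values; the last one is the result of float arithmetic
pf = pd.Series([2.12, 2.10, 0.1 + 0.2], name="Profit Factor")

exact = pf == 0.3    # strict equality, element by element
close = pd.Series(np.isclose(pf.to_numpy(), 0.3), index=pf.index)

print(exact.tolist())
print(close.tolist())
```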

## There's more...

In [234]:
PF.add(1)              # PF + 1

Signal
1371953    3.12
2020078    3.10
2105300    3.14
           ... 
1566301    2.35
2158945    2.02
1614162    2.15
Name: Profit Factor, Length: 4520, dtype: float64

In [235]:
PF.mul(2.5)            # PF * 2.5

Signal
1371953    5.300
2020078    5.250
2105300    5.350
           ...  
1566301    3.375
2158945    2.550
1614162    2.875
Name: Profit Factor, Length: 4520, dtype: float64

In [236]:
PF.floordiv(7)         # PF // 7

Signal
1371953    0.0
2020078    0.0
2105300    0.0
          ... 
1566301    0.0
2158945    0.0
1614162    0.0
Name: Profit Factor, Length: 4520, dtype: float64

In [237]:
PF.gt(7)               # PF > 7

Signal
1371953    False
2020078    False
2105300    False
           ...  
1566301    False
2158945    False
1614162    False
Name: Profit Factor, Length: 4520, dtype: bool

In [238]:
PF.eq(2.1)             # PF == 2.1

Signal
1371953    False
2020078     True
2105300    False
           ...  
1566301    False
2158945    False
1614162    False
Name: Profit Factor, Length: 4520, dtype: bool

In [239]:
PF.dropna().astype(int).mod(5)    # drop missing values first; NaN cannot be cast to int

Signal
1371953    2
2020078    2
2105300    2
          ..
1566301    1
2158945    1
1614162    1
Name: Profit Factor, Length: 4475, dtype: int64

In [240]:
a = type(1)

In [241]:
type(a)

type

In [242]:
a = type(PF)

In [243]:
a([1,2,3])

0    1
1    2
2    3
dtype: int64
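
One practical reason to prefer the method forms over the operators is that they accept extra arguments. The sketch below assumes two small hypothetical Series, `s1` and `s2`, with partly overlapping labels and shows the `fill_value` parameter of `Series.add`.

```python
import pandas as pd

# Hypothetical Series whose indexes only partly overlap
s1 = pd.Series([1.0, 2.0], index=["a", "b"])
s2 = pd.Series([10.0, 20.0], index=["b", "c"])

plain = s1 + s2                     # labels found in only one Series give NaN
filled = s1.add(s2, fill_value=0)   # the missing side is treated as 0

print(plain)
print(filled)
```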

# Chaining Series methods together

In [244]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')
signals

Signal,Rating,Signals,Price,Growth,...,Maximal consecutive loss orders,Maximal balance drawdown pct,Relative drawdown by balance amount,Relative drawdown by equity amount
1371953,1,Activity,30,3,...,8,11.87,2505.44,3553.24
2020078,10,EURSpecial,30,823,...,4,6.09,53.30,342.76
2105300,6,Arnold,30,1,...,9,16.19,802.90,329.80
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,58,39.86,1539.94,8806.67
2158945,5693,2,30,66,...,11,92.19,1081.51,210.08
1614162,5720,345,30,70,...,1,59.00,15659.99,4213.67


In [245]:
growth = signals['Growth']
mbdp = signals['Maximal balance drawdown pct']

In [246]:
growth

Signal
1371953      3
2020078    823
2105300      1
          ... 
1566301     65
2158945     66
1614162     70
Name: Growth, Length: 4520, dtype: int64

In [247]:
growth.value_counts().head(10)

Growth
1    152
2    109
3     93
    ... 
7     83
9     77
8     73
Name: count, Length: 10, dtype: int64

In [248]:
mbdp.isnull().sum()

0

In [249]:
mbdp.dtype

dtype('float64')

In [250]:
mbdp.fillna(0)\
                .astype(int)\
                .head()

Signal
1371953    11
2020078     6
2105300    16
2049959     5
2145152     2
Name: Maximal balance drawdown pct, dtype: int64

## There's more...

In [251]:
mbdp.isnull().mean()

0.0

In [252]:
(mbdp.fillna(0)
                 .astype(int)
                 .head())

Signal
1371953    11
2020078     6
2105300    16
2049959     5
2145152     2
Name: Maximal balance drawdown pct, dtype: int64
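
Because the real column has no missing values, the effect of `fillna` in the chain is not visible above. The following sketch uses a hypothetical Series `s` containing one `NaN` to show each link of the chain doing some work.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value
s = pd.Series([11.87, np.nan, 16.19, 5.81])

# The mean of a boolean Series is the share of True values
missing_share = s.isna().mean()

# Parentheses let the chain span several lines without backslashes
as_int = (
    s.fillna(0)      # replace NaN so the integer cast cannot fail
     .astype(int)    # truncates toward zero
)

print(missing_share)
print(as_int.tolist())
```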

# Making the index meaningful

In [253]:
signals = pd.read_csv('data/mql5_signals_mt4.csv')

In [254]:
signals.shape

(4520, 56)

In [255]:
signals2 = signals.set_index('Signal')
signals2

Signal,Rating,Signals,Price,Growth,...,Maximal consecutive loss orders,Maximal balance drawdown pct,Relative drawdown by balance amount,Relative drawdown by equity amount
1371953,1,Activity,30,3,...,8,11.87,2505.44,3553.24
2020078,10,EURSpecial,30,823,...,4,6.09,53.30,342.76
2105300,6,Arnold,30,1,...,9,16.19,802.90,329.80
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,58,39.86,1539.94,8806.67
2158945,5693,2,30,66,...,11,92.19,1081.51,210.08
1614162,5720,345,30,70,...,1,59.00,15659.99,4213.67


In [256]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')

## There's more...

In [257]:
signals.reset_index()

,Signal,Rating,Signals,Price,...,Maximal consecutive loss orders,Maximal balance drawdown pct,Relative drawdown by balance amount,Relative drawdown by equity amount
0,1371953,1,Activity,30,...,8,11.87,2505.44,3553.24
1,2020078,10,EURSpecial,30,...,4,6.09,53.30,342.76
2,2105300,6,Arnold,30,...,9,16.19,802.90,329.80
...,...,...,...,...,...,...,...,...,...
4517,1566301,5688,High Risk,30,...,58,39.86,1539.94,8806.67
4518,2158945,5693,2,30,...,11,92.19,1081.51,210.08
4519,1614162,5720,345,30,...,1,59.00,15659.99,4213.67
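
`set_index` and `reset_index` are inverses of each other in the simple case. A minimal round trip, assuming a hypothetical two-row frame named `toy`:

```python
import pandas as pd

# Hypothetical two-row stand-in for the raw CSV contents
toy = pd.DataFrame({
    "Signal": [1371953, 2020078],
    "Rating": ["1", "10"],
    "Price": [30, 30],
})

indexed = toy.set_index("Signal")   # Signal moves from the columns to the index
restored = indexed.reset_index()    # Signal moves back as the first column

print(indexed.index.name, list(indexed.columns))
print(restored.equals(toy))
```

Both calls return new DataFrames, so `toy` itself still has `Signal` as an ordinary column.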


# Renaming row and column names

In [258]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')
signals

Signal,Rating,Signals,Price,Growth,...,Maximal consecutive loss orders,Maximal balance drawdown pct,Relative drawdown by balance amount,Relative drawdown by equity amount
1371953,1,Activity,30,3,...,8,11.87,2505.44,3553.24
2020078,10,EURSpecial,30,823,...,4,6.09,53.30,342.76
2105300,6,Arnold,30,1,...,9,16.19,802.90,329.80
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,58,39.86,1539.94,8806.67
2158945,5693,2,30,66,...,11,92.19,1081.51,210.08
1614162,5720,345,30,70,...,1,59.00,15659.99,4213.67


In [259]:
signals.columns

Index(['Rating', 'Signals', 'Price', 'Growth', 'Subscribers', 'Funds',
       'Balance', 'Weeks', 'Drawdown', 'Trades', 'Profit Trades',
       'Loss Trades', 'Best trade', 'Worst trade', 'Gross Profit',
       'Gross Loss', 'Maximum consecutive wins', 'Maximal consecutive profit',
       'Sharpe Ratio', 'Trading activity', 'Max deposit load', 'Latest trade',
       'Trades per week', 'Avg holding time', 'Recovery Factor', 'Long Trades',
       'Short Trades', 'Profit Factor', 'Expected Payoff', 'Average Profit',
       'Average Loss', 'Maximum consecutive losses',
       'Maximal consecutive loss', 'Monthly growth', 'Annual Forecast',
       'Algo trading', 'Absolute balance drawdown', 'Maximal balance drawdown',
       'Relative drawdown by balance', 'Relative drawdown by equity', 'Months',
       'Average By Month', 'Profit Trades Pct', 'Loss Trades Pct',
       'Gross Profit Pct', 'Gross Loss Pips',
       'Maximum consecutive wins amount', 'Maximal consecutive profit orders',
    

In [260]:
# Map existing column names to their replacements (a similar dict passed as index= would rename row labels)
col_rename = {'Trading activity':'Activity', 'Subscribers': 'Customers'}

In [261]:
temp = signals.rename(columns=col_rename).head()    # rename returns a new DataFrame; signals itself is unchanged
temp[['Activity', 'Customers']]


Signal,Activity,Customers
1371953,91.6,26
2020078,42.59,15
2105300,2.11,9
2049959,88.4,142
2145152,86.35,31
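
`rename` can change row labels and column labels in the same call. The sketch below is an assumption-laden illustration: `toy` is a hypothetical two-row frame, and the replacement row label `1` is invented purely to show the `index=` mapping.

```python
import pandas as pd

# Hypothetical two-row frame with the two columns renamed above
toy = pd.DataFrame(
    {"Trading activity": [91.6, 42.59], "Subscribers": [26, 15]},
    index=pd.Index([1371953, 2020078], name="Signal"),
)

renamed = toy.rename(
    index={1371953: 1},   # hypothetical new row label
    columns={"Trading activity": "Activity", "Subscribers": "Customers"},
)

print(renamed)
```

Labels that do not appear in a mapping are left as they are, and `toy` keeps its original names because `rename` returns a copy.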


## There's more...

In [262]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')
index = signals.index
columns = signals.columns

# # using tolist function
# index_list = index.tolist()
# column_list = columns.tolist()

# index_list[0] = 'Ratava'
# index_list[2] = 'Ertceps'
# column_list[1] = 'Director Name'
# column_list[2] = 'Critical Reviews'

In [263]:
# print(index_list[:5])

In [264]:
# print(column_list)

In [265]:
# signals.index = index_list
# signals.columns = column_list

In [266]:
signals.head()

Signal,Rating,Signals,Price,Growth,...,Maximal consecutive loss orders,Maximal balance drawdown pct,Relative drawdown by balance amount,Relative drawdown by equity amount
1371953,1,Activity,30,3,...,8,11.87,2505.44,3553.24
2020078,10,EURSpecial,30,823,...,4,6.09,53.3,342.76
2105300,6,Arnold,30,1,...,9,16.19,802.9,329.8
2049959,9,MAX NoLimit,50,692,...,7,5.81,753.57,2483.64
2145152,12,PabloFX Safe,30,677,...,1,2.93,14.1,500.97


# Creating and deleting columns

In [267]:
signals = pd.read_csv('data/mql5_signals_mt4.csv', index_col='Signal')

In [268]:
signals['Score'] = 0

In [269]:
signals.columns

Index(['Rating', 'Signals', 'Price', 'Growth', 'Subscribers', 'Funds',
       'Balance', 'Weeks', 'Drawdown', 'Trades', 'Profit Trades',
       'Loss Trades', 'Best trade', 'Worst trade', 'Gross Profit',
       'Gross Loss', 'Maximum consecutive wins', 'Maximal consecutive profit',
       'Sharpe Ratio', 'Trading activity', 'Max deposit load', 'Latest trade',
       'Trades per week', 'Avg holding time', 'Recovery Factor', 'Long Trades',
       'Short Trades', 'Profit Factor', 'Expected Payoff', 'Average Profit',
       'Average Loss', 'Maximum consecutive losses',
       'Maximal consecutive loss', 'Monthly growth', 'Annual Forecast',
       'Algo trading', 'Absolute balance drawdown', 'Maximal balance drawdown',
       'Relative drawdown by balance', 'Relative drawdown by equity', 'Months',
       'Average By Month', 'Profit Trades Pct', 'Loss Trades Pct',
       'Gross Profit Pct', 'Gross Loss Pips',
       'Maximum consecutive wins amount', 'Maximal consecutive profit orders',
    

In [270]:
# create two new columns comparing profitable and losing trades
signals['Positive_Pos'] = signals['Profit Trades'] >  signals['Loss Trades']
signals['Positive_Numbers'] = signals['Profit Trades'] -  signals['Loss Trades']

In [271]:
signals

Signal,Rating,Signals,Price,Growth,...,Relative drawdown by equity amount,Score,Positive_Pos,Positive_Numbers
1371953,1,Activity,30,3,...,3553.24,0,True,1138
2020078,10,EURSpecial,30,823,...,342.76,0,True,760
2105300,6,Arnold,30,1,...,329.80,0,True,1150
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,8806.67,0,True,2262
2158945,5693,2,30,66,...,210.08,0,True,190
1614162,5720,345,30,70,...,4213.67,0,True,3329


In [272]:
signals['Positive_Pos'].isnull()

Signal
1371953    False
2020078    False
2105300    False
           ...  
1566301    False
2158945    False
1614162    False
Name: Positive_Pos, Length: 4520, dtype: bool

In [273]:
signals['Positive_Pos'].isnull().sum()

0

In [274]:
signals[['Maximum consecutive wins', 'Maximum consecutive losses']]

Signal,Maximum consecutive wins,Maximum consecutive losses
1371953,61,11
2020078,28,6
2105300,25,10
...,...,...
1566301,52,58
2158945,19,12
1614162,39,35


In [275]:
# create a new boolean column "is_consecutive_good"
signals['is_consecutive_good'] = (signals['Maximum consecutive wins'] >= signals['Maximum consecutive losses'])

In [276]:
signals

Signal,Rating,Signals,Price,Growth,...,Score,Positive_Pos,Positive_Numbers,is_consecutive_good
1371953,1,Activity,30,3,...,0,True,1138,True
2020078,10,EURSpecial,30,823,...,0,True,760,True
2105300,6,Arnold,30,1,...,0,True,1150,True
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,0,True,2262,False
2158945,5693,2,30,66,...,0,True,190,True
1614162,5720,345,30,70,...,0,True,3329,True


In [277]:
signals['is_consecutive_good'].value_counts()

is_consecutive_good
True     4087
False     433
Name: count, dtype: int64

In [278]:
signals['is_consecutive_good'].all()

False

In [279]:
signals = signals.drop('Positive_Numbers', axis='columns')
signals

Signal,Rating,Signals,Price,Growth,...,Relative drawdown by equity amount,Score,Positive_Pos,is_consecutive_good
1371953,1,Activity,30,3,...,3553.24,0,True,True
2020078,10,EURSpecial,30,823,...,342.76,0,True,True
2105300,6,Arnold,30,1,...,329.80,0,True,True
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,8806.67,0,True,False
2158945,5693,2,30,66,...,210.08,0,True,True
1614162,5720,345,30,70,...,4213.67,0,True,True


## There's more...

In [282]:
signals

Signal,Rating,Signals,Price,Growth,...,Relative drawdown by equity amount,Score,Positive_Pos,is_consecutive_good
1371953,1,Activity,30,3,...,3553.24,0,True,True
2020078,10,EURSpecial,30,823,...,342.76,0,True,True
2105300,6,Arnold,30,1,...,329.80,0,True,True
...,...,...,...,...,...,...,...,...,...
1566301,5688,High Risk,30,65,...,8806.67,0,True,False
2158945,5693,2,30,66,...,210.08,0,True,True
1614162,5720,345,30,70,...,4213.67,0,True,True
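
New columns created by assignment are always appended at the end. When position matters, one option is `DataFrame.insert` combined with `Index.get_loc`. The sketch below assumes a hypothetical frame `toy`; the trade counts and the column name `Net Trades` are invented for illustration.

```python
import pandas as pd

# Hypothetical trade counts, not taken from the signals file
toy = pd.DataFrame({
    "Profit Trades": [1500, 900],
    "Loss Trades": [362, 140],
    "Price": [30, 30],
})

# Find the integer position just after 'Loss Trades'
pos = toy.columns.get_loc("Loss Trades") + 1

# insert works in place and places the new column at that position
toy.insert(loc=pos, column="Net Trades",
           value=toy["Profit Trades"] - toy["Loss Trades"])

# drop returns a new DataFrame without the named column
trimmed = toy.drop(columns="Price")

print(list(toy.columns))
print(list(trimmed.columns))
```

Unlike most DataFrame methods, `insert` modifies the frame in place and returns `None`, so its result should not be assigned back.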


In [287]:
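signals.iloc(1)    # calling iloc with parentheses only returns an indexer object; use signals.iloc[1] to select the second row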

<pandas.core.indexing._iLocIndexer at 0x7fdce68359a0>