# 1.2 Generate Attributes

This is the second part of the complete workflow. It generates six attributes that are used to train the classification models.

At the end of the workflow, the dataset will contain the following columns:
- Date
- is_profitable
- close-low
- close-high
- SMA-close
- EMA-close
- EMA_diff
- SMA_diff

In [None]:
# !pip install pandas

In [1]:
import pandas as pd

## Set parameters


In [2]:
sma_range = 10
ema_range = 10
ema_smoothing = 2

## Import dataset

In [3]:
df = pd.read_csv('../generated-datasets/classification-dataset.csv', index_col=0)

## Process data for classification tasks

### Calculate simple moving average

In [4]:
df['SMA'] = df['Close'].rolling(sma_range).mean()
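
As a minimal sketch of what the rolling mean produces, the toy series below (made-up values, not taken from the dataset) uses a 3-period window. The first `window - 1` rows have no full window, so they come back as NaN.

```python
import pandas as pd

# Toy closing prices (illustrative values only).
close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# 3-period simple moving average; rows 0 and 1 lack a full window.
sma = close.rolling(3).mean()
print(sma.tolist())
```

With `sma_range = 10`, the same logic leaves the first nine rows of `SMA` empty.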

### Calculate exponential moving average

In [5]:
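# Seed the EMA with the SMA at the first full window, then smooth from there.
# span=ema_range gives alpha = 2 / (ema_range + 1).
df['EMA'] = df['Close']
df.iloc[ema_range-1, df.columns.get_loc('EMA')] = df['SMA'].iloc[ema_range-1]
df['EMA'] = df['EMA'].iloc[ema_range-1:].ewm(span=ema_range, adjust=False).mean()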
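
The sketch below, on made-up prices, spells out the recursion behind the SMA-seeded EMA and compares it with `ewm(adjust=False)`. The toy values and the names `seeded`, `manual` and `via_pandas` are assumptions for illustration. It also shows where a smoothing factor of 2 enters: pandas derives `alpha = 2 / (span + 1)` from `span`.

```python
import pandas as pd

# Hypothetical toy inputs; the parameter names mirror the notebook's.
ema_range = 3
ema_smoothing = 2
close = pd.Series([10.0, 11.0, 12.0, 13.0, 12.0, 14.0])

# Smoothing weight; equals pandas' 2 / (span + 1) when ema_smoothing is 2.
alpha = ema_smoothing / (1 + ema_range)

# Seed the first full-window row with the SMA of the first ema_range closes.
start = ema_range - 1
seeded = close.copy()
seeded.iloc[start] = close.iloc[:ema_range].mean()

# Manual recursion: EMA_t = alpha * close_t + (1 - alpha) * EMA_{t-1}.
manual = [seeded.iloc[start]]
for price in seeded.iloc[start + 1:]:
    manual.append(alpha * price + (1 - alpha) * manual[-1])

# With adjust=False, pandas uses the first value as the seed and applies
# the same recursion to the rest.
via_pandas = seeded.iloc[start:].ewm(span=ema_range, adjust=False).mean()
print(manual)
print(via_pandas.tolist())
```

Both approaches should agree, which suggests that `span=ema_range` already encodes the smoothing factor of 2 without using `ema_smoothing` explicitly.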

### Calculate intraday price differences

In [6]:
# Vectorized column arithmetic; same result as a row-wise apply.
df['close-low'] = df['Close'] - df['Low']
df['close-high'] = df['Close'] - df['High']
df['SMA-close'] = df['SMA'] - df['Close']
df['EMA-close'] = df['EMA'] - df['Close']


In [7]:
# Change in EMA relative to the previous row.
df['EMA_diff'] = df['EMA'].diff()

In [8]:
# Change in SMA relative to the previous row.
df['SMA_diff'] = df['SMA'].diff()
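
`EMA_diff` and `SMA_diff` are first differences: each row minus the previous row. As a small sketch on made-up values, subtracting a shifted copy and calling `Series.diff()` give the same result, including the NaN rows where no previous value exists.

```python
import pandas as pd

# Toy moving-average values (illustrative only); the leading NaN mimics
# a row without a full window.
sma = pd.Series([float("nan"), 2.0, 3.5, 3.0])

by_shift = sma - sma.shift(1)  # explicit previous-row subtraction
by_diff = sma.diff()           # built-in first difference

print(by_diff.tolist())
```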

### Remove rows with NA values

In [9]:
df.dropna(inplace=True)
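
To see how many rows this step removes, here is a minimal sketch on a toy frame, assuming a hypothetical window `n` that stands in for `sma_range` and `ema_range`. The rolling mean leaves `n - 1` empty rows and the difference column adds one more, so `n` rows are dropped in total.

```python
import pandas as pd

n = 3  # hypothetical window size for illustration
toy = pd.DataFrame({"Close": [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]})
toy["SMA"] = toy["Close"].rolling(n).mean()  # NaN in the first n - 1 rows
toy["SMA_diff"] = toy["SMA"].diff()          # NaN in the first n rows

rows_before = len(toy)
toy = toy.dropna()
rows_dropped = rows_before - len(toy)
print(rows_dropped)
```

Under the same reasoning, the notebook's settings of 10 would be expected to drop the first ten rows, provided the source columns themselves contain no missing values.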

### Drop unused columns

In [11]:
df.drop(['Open','High','Low','Close','Volume','Market Cap', 'SMA', 'EMA'], axis=1, inplace=True)

## Export dataset to CSV

In [12]:
df.to_csv('../generated-datasets/classification-dataset.csv')