# Simple Classification Example

## Install _SeqRep_ package

In [1]:
!python -m pip install git+https://github.com/MIR-MU/seqrep

Collecting git+https://github.com/MIR-MU/seqrep
  Cloning https://github.com/MIR-MU/seqrep to /tmp/pip-req-build-prj6kk23
  Running command git clone -q https://github.com/MIR-MU/seqrep /tmp/pip-req-build-prj6kk23
Building wheels for collected packages: seqrep
  Building wheel for seqrep (setup.py) ... done
  Created wheel for seqrep: filename=seqrep-0.0.0-py3-none-any.whl size=7728 sha256=0dcb63331cb1da1fe45e54b2f8e54c1e149baa0798a0210391bde280a25a70fd
  Stored in directory: /tmp/pip-ephem-wheel-cache-j3xfd3uz/wheels/c3/ac/6b/1d3cf19b0c1e8cdfd15cb65de3fa8f596f68e3aade9e096016
Successfully built seqrep
Installing collected packages: seqrep
Successfully installed seqrep-0.0.0


## Import Required Modules

In [2]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

from seqrep.feature_engineering import PreviousValuesExtractor
from seqrep.labeling import NextColorLabeler
from seqrep.splitting import TrainTestSplitter
from seqrep.evaluation import ClassificationEvaluator
from seqrep.pipeline_evaluation import PipelineEvaluator

# Data source
!pip install yfinance
import yfinance as yf

Collecting yfinance
  Downloading yfinance-0.1.63.tar.gz (26 kB)
Collecting lxml>=4.5.1
  Downloading lxml-4.6.3-cp37-cp37m-manylinux2014_x86_64.whl (6.3 MB)
     |████████████████████████████████| 6.3 MB 33.4 MB/s
Building wheels for collected packages: yfinance
  Building wheel for yfinance (setup.py) ... done
  Created wheel for yfinance: filename=yfinance-0.1.63-py2.py3-none-any.whl size=23918 sha256=717bb69b885ed9ab1cc9976cfe7e6b9844e611903227e25cbfa1c4857303337e
  Stored in directory: /root/.cache/pip/wheels/fe/87/8b/7ec24486e001d3926537f5f7801f57a74d181be25b11157983
Successfully built yfinance
Installing collected packages: lxml, yfinance
  Attempting uninstall: lxml
    Found existing installation: lxml 4.2.6
    Uninstalling lxml-4.2.6:
      Successfully uninstalled lxml-4.2.6
Successfully installed lxml-4.6.3 yfinance-0.1.63


## Load Data
In this example, we use daily price data for *Apple* shares (ticker `AAPL`) from *Yahoo Finance*, downloaded with the `yfinance` package.

In [3]:
data = yf.download(tickers='AAPL', period='10000d', interval='1d')
# Column names have to be lowercase
data.columns = data.columns.str.lower()
data

[*********************100%***********************]  1 of 1 completed


| Date | open | high | low | close | adj close | volume |
|---|---|---|---|---|---|---|
| 1982-01-25 | 0.090402 | 0.090402 | 0.089844 | 0.089844 | 0.070420 | 44710400 |
| 1982-01-26 | 0.087612 | 0.087612 | 0.086496 | 0.086496 | 0.067796 | 21212800 |
| 1982-01-27 | 0.087054 | 0.088170 | 0.087054 | 0.087054 | 0.068234 | 31360000 |
| 1982-01-28 | 0.089844 | 0.090402 | 0.089844 | 0.089844 | 0.070420 | 39603200 |
| 1982-01-29 | 0.090960 | 0.091518 | 0.090960 | 0.090960 | 0.071295 | 53155200 |
| ... | ... | ... | ... | ... | ... | ... |
| 2021-09-15 | 148.559998 | 149.440002 | 146.369995 | 149.029999 | 149.029999 | 83281300 |
| 2021-09-16 | 148.440002 | 148.970001 | 147.220001 | 148.789993 | 148.789993 | 68034100 |
| 2021-09-17 | 148.820007 | 148.820007 | 145.759995 | 146.059998 | 146.059998 | 129728700 |
| 2021-09-20 | 143.800003 | 144.839996 | 141.270004 | 142.940002 | 142.940002 | 123304600 |


## Run Pipeline with Evaluation

In [5]:
%%capture --no-stdout --no-display
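# Step 1: build the scikit-learn pipeline (feature extraction, scaling, classifier)
pipe = Pipeline([
    ('fext_prev', PreviousValuesExtractor()),
    ('scale', MinMaxScaler()),
    ('svc', SVC()),
])
# Step 2: combine the labeler, splitter, pipeline, and evaluator
pipe_eval = PipelineEvaluator(
    labeler=NextColorLabeler(),
    splitter=TrainTestSplitter(),
    pipeline=pipe,
    evaluator=ClassificationEvaluator(),
)
# Step 3: run labeling, splitting, fitting, prediction, and evaluation
result = pipe_eval.run(data=data)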

14:51:48.651244 Labeling data
14:51:48.653801 Splitting data
14:51:48.656534 Fitting pipeline
14:51:51.501784 Predicting
14:51:52.164371 Evaluating predictions
[[1092  119]
 [1183  106]] 
 47.92 % accuracy
 47.11111111111111 % precision of 1 classes
 8.223429014740109 % recall of 1 classes

              precision    recall  f1-score   support

           0       0.48      0.90      0.63      1211
           1       0.47      0.08      0.14      1289

    accuracy                           0.48      2500
   macro avg       0.48      0.49      0.38      2500
weighted avg       0.48      0.48      0.38      2500
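
To help read the report above, the following minimal sketch recomputes the headline numbers by hand from the printed confusion matrix. It assumes the matrix follows the scikit-learn convention (rows are true classes, columns are predicted classes), which is consistent with the `support` column; the variable names are illustrative and not part of _SeqRep_.

```python
# Confusion matrix copied from the evaluation output above.
# Assumption: rows = true class, columns = predicted class
# (the scikit-learn convention).
confusion = [[1092, 119],
             [1183, 106]]

tn, fp = confusion[0]  # true class 0: predicted 0, predicted 1
fn, tp = confusion[1]  # true class 1: predicted 0, predicted 1
total = tn + fp + fn + tp

accuracy = (tp + tn) / total           # share of correct predictions
precision_1 = tp / (tp + fp)           # of rows predicted 1, share truly 1
recall_1 = tp / (tp + fn)              # of rows truly 1, share predicted 1
share_predicted_0 = (tn + fn) / total  # how often the model predicts class 0

print(f"accuracy:            {accuracy:.2%}")
print(f"precision (class 1): {precision_1:.2%}")
print(f"recall (class 1):    {recall_1:.2%}")
print(f"predicted as 0:      {share_predicted_0:.2%}")
```

The recomputed accuracy, precision, and recall agree with the printed values. The last line shows that the classifier assigns class 0 to 91 % of the test rows, which explains the high recall for class 0 and the very low recall for class 1.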

