# Training stocks model

Steps
* [Create feature vector](#Create-feature-vector)
* [Train locally using mlrun pytorch framework](#Train-locally-using-mlrun-pytorch-framework)

In [1]:
# !pip install -U torch

In [2]:
import mlrun

project = mlrun.get_or_create_project(name='stocks', user_project=True, context='src/')

> 2022-09-07 13:09:56,257 [info] loaded project stocks from MLRun DB


## Create feature vector

In [3]:
# Define the list of features we will be using
features = ['stocks.*',
            'news.sentiment',
            ]

# Import MLRun's Feature Store
import mlrun.feature_store as fstore

# Define the feature vector name for future reference
fv_name = 'stocks'

# Define the feature vector using our Feature Store (fstore)
stocks_fv = fstore.FeatureVector(fv_name,
                                 features,
                                 description='stocks information')

# Save the feature vector in the Feature Store
stocks_fv.save()
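
Each entry in `features` follows MLRun's `<feature-set>.<feature>` naming, where `*` selects every feature in the set. As a rough illustration of how the two references break down, the sketch below splits them by hand. The `split_feature_ref` helper is hypothetical and is not part of MLRun, which does its own parsing internally.

```python
def split_feature_ref(ref):
    """Split '<feature-set>.<feature>' into its two parts (hypothetical helper)."""
    feature_set, _, feature = ref.partition(".")
    if not feature_set or not feature:
        raise ValueError(f"expected '<feature-set>.<feature>', got {ref!r}")
    return feature_set, feature


# Same references as in the cell above
features = ["stocks.*", "news.sentiment"]
for ref in features:
    feature_set, feature = split_feature_ref(ref)
    scope = "all features" if feature == "*" else f"feature '{feature}'"
    print(f"{ref}: {scope} from feature set '{feature_set}'")
```

Read this way, the vector takes every feature from the `stocks` feature set plus the single `sentiment` feature from `news`, which matches the columns in the offline dataset.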

In [4]:
# Get the offline feature vector for the last 59 days as a DataFrame
import datetime

end_time = datetime.datetime.now()
start_time = end_time - datetime.timedelta(days=59)
train_dataset = fstore.get_offline_features(fv_name,
                                            start_time=start_time,
                                            end_time=end_time,
                                            entity_timestamp_column='Datetime')
# Alternative: fetch the full history and keep the index columns
# train_dataset = fstore.get_offline_features(fv_name, with_indexes=True, entity_timestamp_column='Datetime')
df = train_dataset.to_dataframe()
df

Unnamed: 0,Open,High,Low,Close,Volume,ticker2onehot_A,ticker2onehot_AAL,ticker2onehot_AAP,ticker2onehot_AAPL,ticker2onehot_ABBV,ticker2onehot_ABC,ticker2onehot_ABMD,ticker2onehot_ABT,ticker2onehot_ACN,ticker2onehot_ADBE,sentiment
0,263.880005,265.320007,263.880005,265.250000,6949,0,0,0,0,0,0,1,0,0,0,
1,182.899994,183.619995,182.335999,183.619995,10228,0,0,1,0,0,0,0,0,0,0,
2,145.669998,145.850006,145.130005,145.419998,2896147,0,0,0,1,0,0,0,0,0,0,
3,386.570007,387.765015,385.130005,386.730011,44939,0,0,0,0,0,0,0,0,0,1,
4,108.599998,108.989998,108.599998,108.849998,125892,0,0,0,0,0,0,0,1,0,0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
31105,129.300003,129.300003,129.300003,129.300003,0,1,0,0,0,0,0,0,0,0,0,0.5
31106,368.299988,368.299988,368.299988,368.299988,0,0,0,0,0,0,0,0,0,0,1,1.0
31107,263.890015,263.890015,263.890015,263.890015,0,0,0,0,0,0,0,1,0,0,0,1.0
31108,263.890015,263.890015,263.890015,263.890015,0,0,0,0,0,0,0,1,0,0,0,1.0
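
In the output above, the first rows have an empty `sentiment` column while the last rows carry a value. Before training it can help to know how many rows lack a sentiment score. The sketch below counts them on a few rows copied from the output, reduced to three columns. It uses only the standard library, so it runs without the feature store.

```python
import csv
import io

# A few rows copied from the output above, reduced to three columns.
sample = """Open,Close,sentiment
263.880005,265.250000,
182.899994,183.619995,
145.669998,145.419998,
129.300003,129.300003,0.5
368.299988,368.299988,1.0
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# An empty string means the sentiment feature had no value for that row.
missing = sum(1 for row in rows if row["sentiment"] == "")
print(f"{missing} of {len(rows)} sample rows have no sentiment value")
```

On the full DataFrame, and assuming the blank cells are stored as NaN, `df['sentiment'].isna().sum()` should give the equivalent count with pandas.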


## Train locally using mlrun pytorch framework 

In [5]:
# Wrap src/train_stocks.py as an MLRun job and mount the shared file system
fn = mlrun.code_to_function('train_stocks_model',
                            kind='job',
                            image='mlrun/ml-models',
                            handler='handler',
                            filename='src/train_stocks.py').apply(mlrun.auto_mount())

In [6]:
import os

params = {'hidden_dim': 2,
          'n_layers': 1,
          'epochs': 3,
          'vector_name': 'stocks',
          'seq_size': 5,
          'start_time': 59,
          'end_time': 0,
          'batch_size': 1,
          'model_filepath': os.path.join(os.getcwd(), 'src')}

# Run the handler in the notebook process and stream its logs
fn.run(local=True, watch=True, params=params)

> 2022-09-07 13:09:59,042 [info] starting run train-stocks-model-handler uid=bce8d07992db443ab9c002bdc6105eac DB=http://mlrun-api:8080


2022-09-07 13:10:19.378107: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-07 13:10:19.378178: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


Epoch 1/3:
Training: 100% |██████████| 31050/31050 [00:59<00:00, 521.42Batch/s, MSELoss=0.0117, accuracy=0.892]  
Validating: 100% |██████████| 31050/31050 [00:18<00:00, 1708.03Batch/s, MSELoss=0.000218, accuracy=0.985]

Summary:
+----------+---------------------+
| Metrics  |       Values        |
+----------+---------------------+
| MSELoss  | 0.02444005385041237 |
| accuracy | 0.9952536225318909  |
+----------+---------------------+

Epoch 2/3:
Training: 100% |██████████| 31050/31050 [00:59<00:00, 519.08Batch/s, MSELoss=1.8e-5, accuracy=0.996]  
Validating: 100% |██████████| 31050/31050 [00:17<00:00, 1754.26Batch/s, MSELoss=0.000114, accuracy=0.989]

Summary:
+----------+----------------------+
| Metrics  |        Values        |
+----------+----------------------+
| MSELoss  | 0.006512375082820654 |
| accuracy |   0.82273268699646   |
+----------+----------------------+

Epoch 3/3:
Training: 100% |██████████| 31050/31050 [01:00<00:00, 510.05Batch/s, MSELoss=0.000363, accuracy=0.981

project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
stocks-dani,...c6105eac,0,Sep 07 13:09:59,completed,train-stocks-model-handler,v3io_user=dani; kind=; owner=dani; host=jupyter-dani-7dc4b6c678-5k58j,,hidden_dim=2; n_layers=1; epochs=3; vector_name=stocks; seq_size=5; start_time=59; end_time=0; batch_size=1; model_filepath=/User/test/demos/stocks-prediction/src,hidden_dim=2; n_layers=1; epochs=3; vector_name=stocks; seq_size=5; start_time=59; end_time=0; batch_size=1; model_filepath=/User/test/demos/stocks-prediction/src; lr=0.0001; training_MSELoss=0.00036267098039388657; training_accuracy=0.9809560775756836; validation_MSELoss=0.0028345854952931404; validation_accuracy=0.9901507049798965,training_MSELoss.html; training_accuracy.html; validation_MSELoss.html; validation_accuracy.html; MSELoss_summary.html; accuracy_summary.html; lr_values.html; stocks_model_custom_objects_map.json; stocks_model_custom_objects.zip; model





> 2022-09-07 13:16:22,310 [info] run executed, status=completed


<mlrun.model.RunObject at 0x7efc33488d50>
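
The `seq_size` parameter is not explained in this notebook. One common reading for sequence models is the number of consecutive rows fed to the network per sample, with the following row as the prediction target. The sketch below illustrates that sliding-window idea on a few rounded `Close` prices from the dataset output. It is an assumption about the general technique, not the code in `src/train_stocks.py`, and `make_windows` is a hypothetical helper.

```python
def make_windows(values, seq_size):
    """Return (window, target) pairs: seq_size consecutive values and the next one."""
    pairs = []
    for start in range(len(values) - seq_size):
        window = values[start:start + seq_size]
        target = values[start + seq_size]
        pairs.append((window, target))
    return pairs


# Rounded Close prices from the dataset output. The rows belong to different
# tickers, so this only illustrates the windowing mechanics.
closes = [265.25, 183.62, 145.42, 386.73, 108.85, 129.30, 368.30]
for window, target in make_windows(closes, seq_size=5):
    print(window, "->", target)
```

With seven values and `seq_size=5` this yields two samples. A real pipeline would presumably build windows per ticker and in time order, which this sketch does not attempt.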