# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="../images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 02: Feature Pipeline</span>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/bitcoin/2_feature_pipeline.ipynb)


## 🗒️ This notebook is divided into 2 sections:
1. Parsing Data.
2. Feature Group Insertion.

### <span style="color:#ff5f27;"> 📝 Imports</span>

In [1]:
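# Helpers from the tutorial's local functions.py
from functions import (parse_btc_data, process_btc_data, get_last_tweets,
                       textblob_processing, vader_processing)

from dotenv import load_dotenv
# Load environment variables from a local .env file
load_dotenv()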

[nltk_data] Downloading package stopwords to /Users/Max/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /Users/Max/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to /Users/Max/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


True

In [2]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/167
Connected. Call `.close()` to terminate connection gracefully.




## <span style="color:#ff5f27;"> 🧙🏼‍♂️ Parsing Data</span>

### <span style='color:#ff5f27'> 📈 Bitcoin Data</span>

In [3]:
df_bitcoin = parse_btc_data(number_of_days_ago=60)
df_bitcoin.head(3)

2022-09-28 16:57:02,993 INFO: New instance of unicorn-binance-rest-api_1.5.0-python_3.9.13 on Darwin 20.6.0 for exchange False started ...
2022-09-28 16:57:02,995 INFO: Initiating `colorama_0.4.5`


Unnamed: 0,date,open,high,low,close,volume,quote_av,trades,tb_base_av,tb_quote_av,unix
0,2022-07-30 00:00:00,23777.28,24668.0,23502.25,23643.51,151060.13211,3634601000.0,4801528,75778.59282,1823456000.0,1659128400000
1,2022-07-31 00:00:00,23644.64,24194.82,23227.31,23293.32,127743.32483,3028386000.0,4463721,63737.43859,1511326000.0,1659214800000
2,2022-08-01 00:00:00,23296.36,23509.68,22850.0,23268.01,144210.16219,3346372000.0,4775213,71458.39583,1658446000.0,1659301200000
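The `unix` column holds each row's `date` as milliseconds since the Unix epoch. In the output above, midnight on 2022-07-30 maps to 1659128400000, which is three hours before midnight UTC, so the timestamps appear to have been computed in a UTC+3 local timezone. A minimal sketch of that conversion using only the standard library follows; `to_unix_ms` and its `utc_offset_hours` parameter are hypothetical names, not part of the tutorial's `functions.py`.

```python
from datetime import datetime, timedelta, timezone


def to_unix_ms(date_str: str, utc_offset_hours: int = 0) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS' in a fixed-offset timezone to epoch milliseconds."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return int(local_dt.timestamp() * 1000)


# Matches the first row of the output above, assuming a UTC+3 offset.
print(to_unix_ms("2022-07-30 00:00:00", utc_offset_hours=3))  # 1659128400000
```

Because the offset changes the resulting values, a pipeline that runs on machines in different timezones would likely want to pin the timezone explicitly (for example to UTC) so that the same day always yields the same `unix` value.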


In [4]:
df_bitcoin_processed = process_btc_data(df_bitcoin)
df_bitcoin_processed.tail(3)

Unnamed: 0,date,open,high,low,close,volume,quote_av,trades,tb_base_av,tb_quote_av,...,exp_std_14_days,momentum_14_days,rate_of_change_14_days,strength_index_14_days,std_56_days,exp_mean_56_days,exp_std_56_days,momentum_56_days,rate_of_change_56_days,strength_index_56_days
58,2022-09-26,18809.13,19318.96,18680.72,19227.82,439239.21943,8356952000.0,5837041,220623.29914,4198006000.0,...,849.990412,-3167.92,-4.688065,44.106314,1808.341707,20451.720894,1601.899389,-4040.19,-16.356379,41.431348
59,2022-09-27,19226.68,20385.86,18816.32,19079.13,593260.74161,11705770000.0,8152473,296727.71059,5856646000.0,...,801.416277,-1094.44,-5.673587,43.033766,1812.655631,20397.16026,1592.846495,-3908.66,-16.386972,41.179752
60,2022-09-28,19078.1,19238.28,18471.28,19154.64,309292.11375,5834608000.0,4400104,153801.87689,2901864000.0,...,750.719506,-1072.07,-2.777603,43.781424,1815.932383,20347.999156,1579.974558,-3663.73,-15.331048,41.363866
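The column names above (`momentum_14_days`, `rate_of_change_14_days`, `exp_mean_56_days`, and so on) point to windowed indicators computed over the price series. The exact formulas live in `process_btc_data` and are not shown in this notebook, so the sketch below uses common textbook definitions purely as an illustration. The function names are hypothetical, and the results are not expected to reproduce the values printed above.

```python
def momentum(closes, window):
    """Price change over `window` periods: close[t] - close[t - window]."""
    return [
        None if i < window else closes[i] - closes[i - window]
        for i in range(len(closes))
    ]


def rate_of_change(closes, window):
    """Percentage change over `window` periods."""
    return [
        None if i < window else 100.0 * (closes[i] / closes[i - window] - 1.0)
        for i in range(len(closes))
    ]


def exp_mean(closes, span):
    """Recursive exponentially weighted mean with alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = []
    for price in closes:
        # Seed with the first price, then blend each new price with the running mean.
        out.append(price if not out else alpha * price + (1 - alpha) * out[-1])
    return out


# Illustrative closing prices, not taken from the dataset.
closes = [100.0, 102.0, 101.0, 105.0, 110.0]
print(momentum(closes, 2))   # [None, None, 1.0, 3.0, 9.0]
print(exp_mean(closes, 3))   # [100.0, 101.0, 101.0, 103.0, 106.5]
```

The first `window` entries are `None` because no earlier price exists to compare against. A pandas-based implementation would typically produce `NaN` in those positions instead.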


### <span style='color:#ff5f27'> 💭 Tweets Data</span>

In [5]:
df_tweets_parsed = get_last_tweets()
df_tweets_parsed.head()



Unnamed: 0,date,text
0,2022-09-23 00:26:04+00:00,Fiat money printing hides incompetence at all ...
1,2022-09-23 00:51:41+00:00,Why is #bitcoin dropping? And will it fall bel...
2,2022-09-23 04:37:01+00:00,The colors of money What is your color? #bitco...
3,2022-09-23 05:38:31+00:00,It’s time to separate money from state. #Bitco...
4,2022-09-23 06:00:51+00:00,"Sometimes, it's just sad to see people who don..."


In [6]:
tweets_textblob = textblob_processing(df_tweets_parsed)
tweets_textblob.head()

Unnamed: 0,date,subjectivity,polarity,unix
0,2022-09-20 00:00:00,7.855706,1.711442,1663621200000
1,2022-09-21 00:00:00,7.413795,1.76878,1663707600000
2,2022-09-22 00:00:00,2.998512,0.786066,1663794000000
3,2022-09-23 00:00:00,8.131728,1.814836,1663880400000
4,2022-09-24 00:00:00,2.929545,1.236364,1663966800000


In [7]:
tweets_vader = vader_processing(df_tweets_parsed)
tweets_vader.head()

100%|███████████████████████████████████████| 157/157 [00:00<00:00, 4558.14it/s]


Unnamed: 0,date,compound,unix
0,2022-09-20 00:00:00,2.2292,1663621200000
1,2022-09-21 00:00:00,3.6847,1663707600000
2,2022-09-22 00:00:00,1.5785,1663794000000
3,2022-09-23 00:00:00,1.3411,1663880400000
4,2022-09-24 00:00:00,-0.4297,1663966800000
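Both sentiment tables hold one row per day, and values such as a `compound` of 3.6847 fall outside VADER's per-tweet range of [-1, 1]. That suggests the per-tweet scores are aggregated per calendar day, most likely by summation. The sketch below illustrates such an aggregation step on made-up scores; `daily_sum` is a hypothetical helper, and the input values are illustrative rather than taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime


def daily_sum(scored_tweets):
    """Sum per-tweet scores by calendar day.

    `scored_tweets` is an iterable of (ISO-8601 timestamp, score) pairs.
    Returns a list of (date, total) tuples sorted by date.
    """
    totals = defaultdict(float)
    for timestamp, score in scored_tweets:
        day = datetime.fromisoformat(timestamp).date()
        totals[day] += score
    return sorted(totals.items())


# Illustrative scores only; real scores would come from TextBlob or VADER.
sample = [
    ("2022-09-23 00:26:04+00:00", 0.5),
    ("2022-09-23 04:37:01+00:00", 0.25),
    ("2022-09-24 06:00:51+00:00", -0.5),
]
for day, total in daily_sum(sample):
    print(day, total)
```

With summation, days with more tweets carry more weight. Averaging instead would make days comparable regardless of tweet volume, at the cost of discarding that volume signal.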


---
## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Feature Store </span>

In [8]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store()

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/167
Connected. Call `.close()` to terminate connection gracefully.


## <span style="color:#ff5f27;">⬆️ Uploading new data to the Feature Store</span>

### <span style='color:#ff5f27'> 📈 Bitcoin Feature Group</span>

In [9]:
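# Get the feature group by name and version
btc_price_fg = fs.get_or_create_feature_group(
    name='bitcoin_price',
    version=1
)

# Insert the processed rows; this starts the backfill job shown in the output
btc_price_fg.insert(df_bitcoin_processed)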

Uploading Dataframe: 0.00% |          | Rows 0/61 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/167/jobs/named/bitcoin_price_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x7f96e6a80c70>, None)

### <span style='color:#ff5f27'> 💭 Tweets Feature Groups</span>

In [10]:
tweets_textblob_fg = fs.get_or_create_feature_group(
    name='bitcoin_tweets_textblob',
    version=1
)

tweets_textblob_fg.insert(tweets_textblob)

Uploading Dataframe: 0.00% |          | Rows 0/9 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/167/jobs/named/bitcoin_tweets_textblob_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x7f96e71b4a30>, None)

In [11]:
tweets_vader_fg = fs.get_or_create_feature_group(
    name='bitcoin_tweets_vader',
    version=1
)

tweets_vader_fg.insert(tweets_vader)

Uploading Dataframe: 0.00% |          | Rows 0/9 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/167/jobs/named/bitcoin_tweets_vader_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x7f96ebe15d60>, None)
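The three cells above repeat the same fetch-then-insert pattern. A small helper can make that pattern explicit and also guard against duplicate key values before uploading. This is a sketch rather than the tutorial's code: `insert_rows` is a hypothetical name, and using `unix` as both primary key and event time is an assumption about how the feature groups were originally defined. `primary_key` and `event_time` are keyword arguments of `get_or_create_feature_group`, and they should only matter when the feature group does not exist yet.

```python
def insert_rows(fs, name, df, key="unix", version=1):
    """Fetch (or lazily create) a feature group and insert `df` into it.

    `fs` is a feature store handle such as the one returned by
    `project.get_feature_store()`; `df` is the DataFrame to upload.
    """
    # Refuse to upload rows that share a key value.
    keys = list(df[key])
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate values in key column '{key}'")

    fg = fs.get_or_create_feature_group(
        name=name,
        version=version,
        primary_key=[key],  # assumption: matches the original definition
        event_time=key,     # assumption: matches the original definition
    )
    fg.insert(df)
    return fg


# Usage with the objects from this notebook (not executed here):
# insert_rows(fs, "bitcoin_price", df_bitcoin_processed)
# insert_rows(fs, "bitcoin_tweets_textblob", tweets_textblob)
# insert_rows(fs, "bitcoin_tweets_vader", tweets_vader)
```

Passing `fs` in as an argument keeps the helper free of global state, which also makes it straightforward to exercise with a stand-in object in tests.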

---