<center>
<a href="https://github.com/kamu-data/kamu-cli">
<img alt="kamu" src="https://raw.githubusercontent.com/kamu-data/kamu-cli/master/docs/readme_files/kamu_logo.png" width=270/>
</a>
</center>

<br/>

<center><i>World's first decentralized real-time data warehouse, on your laptop</i></center>

<br/>

<div align="center">
<a href="https://github.com/kamu-data/kamu-cli">Repo</a> | 
<a href="https://docs.kamu.dev/cli/">Docs</a> | 
<a href="https://docs.kamu.dev/cli/learn/learning-materials/">Tutorials</a> | 
<a href="https://docs.kamu.dev/cli/learn/examples/">Examples</a> |
<a href="https://docs.kamu.dev/cli/get-started/faq/">FAQ</a> |
<a href="https://discord.gg/nU6TXRQNXC">Discord</a> |
<a href="https://kamu.dev">Website</a>
</div>

<center>

<br/>
<br/>
    
# 1. Working with Web3 data

</center>

# Introduction

No matter if you are Blockchain-cautious, Web3-curious, or already own a mansion in the Metaverse - you'll probably agree that Distributed Ledgers (Blockchains) are quickly becoming **very large sources of various data**. And we really like data!

The Web2 and Web3 worlds of data, however, are still very fragmented:

* Web2 data sits in thousands of silos, in hundreds of different formats, or hidden behind custom JSON APIs which take weeks to integrate. We are still very far from achieving interoperability of data even in the Web2 world alone.

* Web3 data from public ledgers, despite being freely available, presents a big challenge because of its volume and formats that are unfriendly to Data Science. Some projects like [The Graph](https://thegraph.com/) and [Dune](https://dune.com/) make it more accessible, but don't offer much help when you want to combine data from Web2 and Web3. Even a slightly more advanced use case often requires developing your own data ingestion infrastructure, which most of us cannot afford.

What if we could solve both of these problems with a **single technology**? Make data overall much more easily **accessible** and **interoperable**, and erase the boundary between **off- and on-chain data**.

In this demo we will see how `kamu`'s data pipelines can make this possible, allowing you to share data using modern Web3 decentralized storage systems, and letting you easily combine on- and off-chain data within a single query.

<div class="alert alert-block alert-info">

This demo is intended to be standalone, but if at any point you feel lost you might want to revisit the _"Kamu Basics"_ chapter first. You are also very welcome on our [Discord](https://discord.gg/nU6TXRQNXC) or can create an issue in the [kamu-cli](https://github.com/kamu-data/kamu-cli) GitHub repository to get help.

</div>

### Use Case

But first we need to pick a use case, so why don't we do some **personal finance**?

You like financial planning, don't you?

Neither do we... Being a grown up and having to deal with multiple bank accounts and retirement plans - what can be more boring?

But perhaps you spice things up by holding some cryptocurrency ... except now all your money is **spread over multiple different institutions and wallets** and it's very easy to **lose track** of your overall financial situation.

Most tools that banks and wallet apps offer are already mediocre, but now they are of no use at all as they only show you small parts of the whole picture.

<div class="alert alert-block alert-info">

Fun fact: `kamu` started in 2018 as a huge set of scripts that ingested data from multiple bank, retirement, and investment accounts, unified all currencies, and analyzed the performance of investments over time. This pipeline was so painful to maintain that we started to look for a better, fully autonomous solution.

</div>

For this demo let's assume that you **had some Ethereum**. To get more upside while holding it you decided to **"stake" it in the [Rocketpool](https://rocketpool.net/)**.

<div class="alert alert-block alert-warning">
<details>
<summary style="display:list-item"><b>What is Staking?</b></summary>

Staking is when you lock up some of your Ethereum as collateral to become a validator of ledger transactions. Staking pools like _Rocketpool_ allow you to invest any amount of ETH and let other people operate the transaction validator nodes for you while you all share the validation rewards.

</details>
</div>

A few months later you start wondering:
- Was that a good investment?
- How much is it worth now and what is the return?
- How did it perform over time compared to other things you invest in?

These questions are so easy to ask, but **so hard to answer**!

Throughout this demo we will create a personal data pipeline that can not only provide you an answer, but one that can **constantly stay up-to-date**, giving you full awareness of your portfolio's performance.

<div class="alert alert-block alert-danger">

This demo should not be taken as financial advice or a comment on cryptocurrency - we are only interested in the data science aspects of it.

</div>


Our first pipeline will look like this:

![blah](files/pipeline-1.png)

```
┌───────────────────────────────────────┬──────────────┬─────────────────────────────────────────────────┐
│                 Name                  │     Kind     │                  Description                    │
├───────────────────────────────────────┼──────────────┼─────────────────────────────────────────────────┤
│ net.rocketpool.reth.mint-burn         │ Remote(Root) │ rETH transactions (pulled from IPFS)            │
│ com.cryptocompare.ohlcv.eth-usd       │ Remote(Root) │ ETH to USD exchange rate (pulled from IPFS)     │
│ account.tokens.transfers              │     Root     │ Wallet token transfers (sourced from Etherscan) │
│ account.transactions                  │     Root     │ Wallet transactions (sourced from Etherscan)    │
│ account.tokens.portfolio              │  Derivative  │ Tokens portfolio with book prices & amount held │
│ account.tokens.portfolio.usd          │  Derivative  │ Tokens portfolio with USD book prices           │
│ account.tokens.portfolio.market-value │  Derivative  │ Tokens portfolio market value in ETH and USD    │
└───────────────────────────────────────┴──────────────┴─────────────────────────────────────────────────┘
```

# Getting datasets from IPFS
When you stake your `ETH` in Rocketpool - the Smart Contract takes your `ETH` and issues you a corresponding amount of `rETH` tokens to represent your stake.

Instead of periodically sending you more `rETH`, your staking rewards are "delivered" by changing the exchange rate between `ETH` and `rETH`, e.g. if you paid `1 ETH` for `1 rETH` in 2021, in 2022 you could sell `1 rETH` for `1.024 ETH` i.e. a 2.4% gain.
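
To make the arithmetic concrete, here is a minimal sketch that turns two exchange rates into a relative gain. The function name `staking_gain` is our own, and the numbers are the illustrative figures from the paragraph above, not real market data.

```python
def staking_gain(buy_rate_eth: float, sell_rate_eth: float) -> float:
    """Relative gain from holding rETH while its ETH exchange rate moves."""
    return sell_rate_eth / buy_rate_eth - 1.0


# Figures from the example above: 1 rETH bought for 1 ETH, later sold for 1.024 ETH
gain = staking_gain(buy_rate_eth=1.0, sell_rate_eth=1.024)
print(f"{gain:.1%}")  # prints "2.4%"
```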

Let's try to visualize these exchange rates.

<div class="alert alert-block alert-success">

First, we initialize our workspace:
    
<p style="background:black">
<code style="background:black;color:white">cd "02 - Web3 Data (Ethereum trading example)"
kamu init
</code>
</p>
</div>

Since Ethereum is an open data source - we can find out the exchange rates by simply looking at all blockchain transactions involving the `rETH` contract and seeing how much people buy and sell it for.

We believe that <mark>getting data should be easy</mark>, so before we explain how data gets into the network (see chapter XXXXXXXXXX), let's see **how easy it is to get data that's already in Open Data Fabric**.

<div class="alert alert-block alert-success">
    
Run the following command:

<p style="background:black">
<code style="background:black;color:white">kamu pull "ipns://net.rocketpool.reth.mint-burn.ipns.kamu.dev" --as net.rocketpool.reth.mint-burn
</code>
</p>
</div>

<div class="alert alert-block alert-warning">
Pulling from IPFS may take a few minutes, so if you'd like to sacrifice the full "decentralized data experience" for speed you can also pull data from S3:
    
<p style="background:black">
<code style="background:black;color:white">kamu pull "https://s3.us-west-2.amazonaws.com/datasets.kamu.dev/odf/v1/contrib/net.rocketpool.reth.mint-burn"
</code>
</p>
</div>

Lots of cool things are happening in this one command:
- Beforehand we prepared a [rETH transactions](https://github.com/kamu-data/kamu-cli/blob/master/images/demo/user-home/02%20-%20Web3%20Data%20%28Ethereum%20trading%20example%29/datasets/rocketpool.reth.mint-burn.yaml) dataset for you
- A [periodic job](https://github.com/kamu-data/kamu-contrib/actions) uses `kamu` to ingest new data from an Ethereum node
- Dataset is stored in [IPFS](https://ipfs.io) - an "Inter-Planetary File System"
- Using DNS everyone can refer to this dataset as `ipns://net.rocketpool.reth.mint-burn.ipns.kamu.dev`
- The DNS record resolves into an IPFS hash (`CID`) of the latest version of the dataset
- `kamu` uses the CID to download the entire dataset block-by-block
- Next time you do `kamu pull net.rocketpool.reth.mint-burn` only the new blocks will be downloaded (i.e. a minimal update)
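
The "minimal update" idea from the last two bullets can be sketched with a toy model. Here a dataset is assumed to be a chain of blocks where each block points to its predecessor, so a client can walk back from the latest block until it reaches one it already has. The block names, the `remote` dictionary, and the `new_blocks` helper are all hypothetical and do not reflect `kamu`'s actual storage format or API.

```python
from typing import Dict, List, Optional, Set, Tuple

# Toy remote store: block hash -> (previous block hash, payload)
RemoteStore = Dict[str, Tuple[Optional[str], str]]


def new_blocks(remote: RemoteStore, head: str, known: Set[str]) -> List[str]:
    """Walk back from `head` until a known block; return missing hashes oldest-first."""
    missing: List[str] = []
    cursor: Optional[str] = head
    while cursor is not None and cursor not in known:
        missing.append(cursor)
        cursor = remote[cursor][0]  # follow the link to the previous block
    return list(reversed(missing))


remote: RemoteStore = {
    "b1": (None, "seed"),
    "b2": ("b1", "data 1"),
    "b3": ("b2", "data 2"),
    "b4": ("b3", "data 3"),
}

# First pull: nothing is known locally and the latest block is b3
first_pull = new_blocks(remote, head="b3", known=set())
# Second pull: the latest block has moved to b4, only that block is fetched
second_pull = new_blocks(remote, head="b4", known=set(first_pull))
```

In this model the cost of a repeated pull depends on how many blocks were added since the last one, not on the total size of the dataset.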

So we are pulling <mark>**public ledger**</mark> data that is parsed into <mark>**analytical data format**</mark> by <mark>**verifiable code**</mark> and stored in a <mark>**globally-decentralized file system**</mark> as a <mark>**near-real-time data stream**</mark>.

...Neat!

<div class="alert alert-block alert-info">

Try out the following commands (add `--help` to read what they do):
    
<p style="background:black">
<code style="background:black;color:white">kamu list
kamu tail net.rocketpool.reth.mint-burn
kamu log net.rocketpool.reth.mint-burn
kamu inspect schema net.rocketpool.reth.mint-burn
kamu repo alias list
</code>
</p>
</div>

Data is in, now let's visualize it.

If you run `kamu tail` - you can see that this dataset contains all individual transactions that involved the rETH token.

The rETH token follows the [ERC-20 Token Standard](https://ethereum.org/en/developers/docs/standards/tokens/erc-20/), and its contract records a `TokensMinted` event in the Ethereum transaction logs when a token is issued in exchange for `ETH`, and a `TokensBurned` event when `rETH` is exchanged back into `ETH`.

Using this we can now create the **instantaneous buy/sell exchange rate graph**:

<div class="alert alert-block alert-warning">
<details>
<summary style="display:list-item">Need a quick refresher on using <b>kamu's Jupyter notebooks</b>?</summary>

The Jupyter notebook you're using now runs either on our demo server (https://demo.kamu.dev) or can be launched with the `kamu notebook` command in your own workspace when you have the tool installed.
    
To start working with data:
- First run `%load_ext kamu` to load our extension
- Then use `%import_dataset dataset_name` to import datasets from your workspace

The above commands will start the Apache Spark SQL server in the background and connect to it.
    
By default all code cells execute in the PySpark environment, which most of the time is not what we want.
    
Instead we use `%%sql` cells to run SQL queries in Spark. It's a great way to explore and shape your data.
    
You can download the result of any SQL query into the notebook's Python process using `%%sql -o pandas_dataframe_variable -n records_limit`.
    
You can then use `%%local` cells to execute Python code inside the notebook to further process or visualize the data.
    
</details>
</div>

In [None]:
%%local
import pandas as pd
import hvplot.pandas
pd.set_option('max_colwidth', None)

In [None]:
%load_ext kamu
%import_dataset net.rocketpool.reth.mint-burn

In [None]:
%%sql
select * from `net.rocketpool.reth.mint-burn` limit 5

In [None]:
%%sql -o reth_pool -q

--## The -o <name> option above downloads the SQL query result
--## into the local notebook as Pandas dataframe
select 
    event_time, 
    case 
        when event_name = "TokensMinted" then "Mint"
        when event_name = "TokensBurned" then "Burn"
    end as event_name, 
    avg(eth_amount / amount) as rate
from `net.rocketpool.reth.mint-burn` 
group by event_time, event_name
order by 1

In [None]:
%%local
reth_pool.hvplot.step(
    x="event_time", 
    by="event_name", 
    width=900, height=600, 
    legend='top_left', grid=True, 
    title="ETH : rETH Ratio (Minting and Burning)",
)

From this we can tell that Rocketpool so far is fulfilling its promise of steady staking returns.
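
For readers less familiar with SQL, the aggregation in the query above can be reproduced in plain Python. This is only a sketch on a few made-up rows (the amounts are not real transactions), using the same `eth_amount / amount` ratio and the same grouping by time and event type.

```python
from collections import defaultdict

# Made-up sample rows: (event_time, event_name, eth_amount, amount of rETH)
events = [
    ("2022-01-01", "TokensMinted", 10.0, 10.0),
    ("2022-06-01", "TokensMinted", 5.1, 5.0),
    ("2022-06-01", "TokensMinted", 10.2, 10.0),
    ("2022-06-01", "TokensBurned", 2.04, 2.0),
]
labels = {"TokensMinted": "Mint", "TokensBurned": "Burn"}

# Collect the ETH : rETH ratio of every event under its (time, label) group
ratios = defaultdict(list)
for event_time, event_name, eth_amount, amount in events:
    ratios[(event_time, labels[event_name])].append(eth_amount / amount)

# Average the ratios within each group, ordered by time like the SQL query
rates = {key: sum(vals) / len(vals) for key, vals in sorted(ratios.items())}
```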

## Getting ETH to USD exchange rate

While we're at it, let's also use the same mechanism to get the ETH to USD exchange rate:

<div class="alert alert-block alert-success">
    
Pull existing dataset from IPFS:
    
<p style="background:black">
<code style="background:black;color:white">kamu pull "ipns://com.cryptocompare.ohlcv.eth-usd.ipns.kamu.dev" --as com.cryptocompare.ohlcv.eth-usd
</code>
</p>
</div>

<div class="alert alert-block alert-warning">
Pulling from IPFS may take a few minutes, so if you'd like to sacrifice the full "decentralized data experience" for speed you can also pull data from S3:
    
<p style="background:black">
<code style="background:black;color:white">kamu pull "https://s3.us-west-2.amazonaws.com/datasets.kamu.dev/odf/v1/contrib/com.cryptocompare.ohlcv.eth-usd"
</code>
</p>
</div>

In [None]:
%import_dataset com.cryptocompare.ohlcv.eth-usd

In [None]:
%%sql
select * from `com.cryptocompare.ohlcv.eth-usd` 
order by event_time desc 
limit 5

In [None]:
%%sql -o eth2usd -q
select * from `com.cryptocompare.ohlcv.eth-usd` order by event_time

In [None]:
%%local
eth2usd.hvplot.line(
    x="event_time",
    y="close",
    height=500, 
    width=800,
)

## Ingesting Account Data from Etherscan
Let's get data about our account now.

This dataset will be personalized, so we don't have it prepared. Instead, we will create our own Root datasets using data from the [Etherscan API](https://etherscan.io/) (the free API tier will be enough for our needs).

<div class="alert alert-block alert-success">

Add datasets and pull data:

<p style="background:black">
<code style="background:black;color:white">kamu add datasets/account.tokens.transfers.yaml datasets/account.transactions.yaml
kamu pull account.tokens.transfers account.transactions
</code>
</p>
</div>

The key part of the `account.tokens.transfers` dataset manifest is:
```yaml
kind: DatasetSnapshot
version: 1
content:
  name: account.tokens.transfers
  kind: root
  metadata:
    - kind: setPollingSource
      fetch:
        kind: url
        url: "https://api.etherscan.io/api\
          ?module=account\
          &action=tokentx\
          &address=0xeadb3840596cabf312f2bc88a4bb0b93a4e1ff5f\
          &page=1\
          &offset=1000\
          &startblock=0\
          &endblock=99999999"
      prepare: ...
      read: ...
      preprocess: ...
      merge:
        kind: ledger
        primaryKey:
          - transaction_hash
```

We are asking Etherscan to return all ERC-20 token transactions involving account `0xeadb3840596cabf312f2bc88a4bb0b93a4e1ff5f` since the beginning of time (`startblock=0`) and merging them with existing data (if any) using the `ledger` [merge strategy](https://docs.kamu.dev/cli/ingest/merge-strategies/).
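
The intuition behind a ledger-style merge can be sketched in a few lines. This is a simplified illustration, not `kamu`'s implementation: we assume the source is append-only and that every poll may return records we have already seen, so only records with a new primary key are appended. The `ledger_merge` helper and the sample records are hypothetical.

```python
from typing import Dict, List, Sequence


def ledger_merge(
    existing: List[Dict], incoming: List[Dict], primary_key: Sequence[str]
) -> List[Dict]:
    """Append only the incoming records whose primary key has not been seen yet."""
    seen = {tuple(rec[k] for k in primary_key) for rec in existing}
    merged = list(existing)
    for rec in incoming:
        key = tuple(rec[k] for k in primary_key)
        if key not in seen:
            seen.add(key)
            merged.append(rec)
    return merged


# First poll returns two transfers, the second poll repeats one and adds another
first_poll = [
    {"transaction_hash": "0xaa", "value": "1"},
    {"transaction_hash": "0xbb", "value": "2"},
]
second_poll = [
    {"transaction_hash": "0xbb", "value": "2"},
    {"transaction_hash": "0xcc", "value": "3"},
]

dataset = ledger_merge([], first_poll, primary_key=["transaction_hash"])
dataset = ledger_merge(dataset, second_poll, primary_key=["transaction_hash"])
```

Because already-seen keys are skipped, re-fetching the full history on every poll does not produce duplicates.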

<div class="alert alert-block alert-warning">

Here we are using some **random person's account address** who performed many rETH transactions.
    
We picked it for illustration purposes only, and once you're done with the demo you can get this pipeline and **substitute your own wallet address**!

</div>

In [None]:
%import_dataset account.tokens.transfers

In [None]:
%%sql
select * from `account.tokens.transfers` 
order by block_number desc
limit 5

In [None]:
%%sql
select
    token_name as `Token`, 
    sum(abs(value) / pow(10, token_decimal)) as `Volume Traded` 
from `account.tokens.transfers`
group by 1

<div class="alert alert-block alert-info">

Try switching to **Bar** and **Pie** visualization types above.

</div>

As you can see the `account.tokens.transfers` dataset gives us the **number of tokens transferred**, and by looking at the `from` / `to` addresses we can tell whether tokens were added to or taken out of our account.

... But we don't know **how much** `ETH` the tokens were bought or sold for.

This is why we need the `account.transactions` dataset that contains all account transactions along with their `ETH` value.
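
As a rough sketch of how a raw transfer record becomes a signed token amount, the snippet below scales the integer `value` by `token_decimal` (as the query above does) and picks the sign from the `from` / `to` addresses. We assume here that the values arrive as strings; the `token_delta` helper and the counterparty address are hypothetical.

```python
from decimal import Decimal

ACCOUNT = "0xeadb3840596cabf312f2bc88a4bb0b93a4e1ff5f"
# Hypothetical counterparty address, used only for this illustration
OTHER = "0x0000000000000000000000000000000000000001"


def token_delta(transfer: dict, account: str = ACCOUNT) -> Decimal:
    """Return the signed change in the account's token balance, in whole token units."""
    units = Decimal(transfer["value"]) / Decimal(10) ** int(transfer["token_decimal"])
    if transfer["to"].lower() == account.lower():
        return units  # tokens received
    if transfer["from"].lower() == account.lower():
        return -units  # tokens sent away
    raise ValueError("transfer does not involve this account")


received = token_delta(
    {"from": OTHER, "to": ACCOUNT, "value": "1500000000000000000", "token_decimal": "18"}
)
sent = token_delta(
    {"from": ACCOUNT, "to": OTHER, "value": "250000", "token_decimal": "6"}
)
```

`Decimal` is used instead of floats so that large integer token values are scaled without rounding errors.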

In [None]:
%import_dataset account.transactions

In [None]:
%%sql
select *
from `account.transactions` 
order by block_number desc
limit 5

In [None]:
%%sql -o transactions -q
select
    *, 
    value / pow(10, 18) as value_eth 
from `account.transactions` 
order by block_number desc

In [None]:
%%local
transactions
transactions.hvplot.scatter(
    x="block_time",
    y="value_eth",
    title="Account Transactions in ETH",
    xlabel="Time",
    ylabel="ETH",
    color="red",
    alpha=0.5,
)

## Tracking token portfolio using derivative datasets

To understand our "portfolio" of tokens we would like to have a dataset that:
- Contains individual token transactions along with book/sell price in ETH
- Tracks the cumulative number of tokens held per token type
- Tracks the cumulative book price in ETH

<div class="alert alert-block alert-success">

We achieve this using the following derivative dataset:

<p style="background:black">
<code style="background:black;color:white">kamu add datasets/account.tokens.portfolio.yaml
kamu pull account.tokens.portfolio
</code>
</p>
</div>

The key parts of this dataset look like this:

```yaml
---
kind: DatasetSnapshot
version: 1
content:
  name: account.tokens.portfolio
  kind: derivative
  metadata:
    - kind: setTransform
      inputs:
        - name: account.tokens.transfers
        - name: account.transactions
      transform:
        kind: sql
        engine: flink
        queries:
          # Convert token transfers into (token_type, +/- delta) form
          - alias: token_transfers
            query: ...
          # Convert ETH transactions into (transaction, +/- delta) form
          - alias: transactions
            query: ...
          # JOIN the `token_transfers` and `transactions` datasets
          - alias: token_transactions
            query: |
              select
                tr.block_time,
                tr.block_number,
                tr.transaction_hash,
                tx.symbol as account_symbol,
                tr.token_symbol,
                tr.token_amount,
                tx.eth_amount
              from token_transfers as tr
              left join transactions as tx
              on 
                tr.transaction_hash = tx.transaction_hash
                and tr.block_time = tx.block_time
          # Use a window function to calculate cumulative balance and book value
          - alias: account.tokens.portfolio
            query: >
              select
                *,
                sum(token_amount) over (partition by token_symbol order by block_time) as token_balance,
                sum(-eth_amount) over (partition by token_symbol order by block_time) as token_book_value_eth
              from token_transactions
```
Remember that this is **Streaming SQL** - we are not joining tables, but rather two potentially real-time and infinite streams of data.

This particular type is a [Stream-to-Stream JOIN](https://docs.kamu.dev/cli/transform/joins-s2s/).

In the next chapter we will explore why the stream processing model is such a big deal.
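
To see what the two window functions in the final query compute, here is a minimal batch sketch in Python. It keeps a running sum of `token_amount` and of `-eth_amount` per `token_symbol`, ordered by `block_time`. The sample rows are made up, the timestamps are plain integers, and ties in `block_time` are ignored, so treat it as an illustration of the idea rather than an equivalent of the streaming query.

```python
from collections import defaultdict
from typing import Dict, List


def with_running_totals(rows: List[Dict]) -> List[Dict]:
    """Add cumulative token balance and ETH book value per token symbol."""
    balance: Dict[str, float] = defaultdict(float)
    book_value: Dict[str, float] = defaultdict(float)
    out = []
    for row in sorted(rows, key=lambda r: r["block_time"]):
        symbol = row["token_symbol"]
        balance[symbol] += row["token_amount"]
        # ETH spent is negative, so negating it accumulates the book value
        book_value[symbol] += -row["eth_amount"]
        out.append(
            {
                **row,
                "token_balance": balance[symbol],
                "token_book_value_eth": book_value[symbol],
            }
        )
    return out


# Made-up rows: two purchases followed by a partial sale
rows = [
    {"block_time": 1, "token_symbol": "rETH", "token_amount": 2.0, "eth_amount": -2.0},
    {"block_time": 2, "token_symbol": "rETH", "token_amount": 1.0, "eth_amount": -1.5},
    {"block_time": 3, "token_symbol": "rETH", "token_amount": -1.0, "eth_amount": 1.25},
]
portfolio_rows = with_running_totals(rows)
```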

In [None]:
%import_dataset account.tokens.portfolio

In [None]:
%%sql -o portfolio -q
select * from `account.tokens.portfolio` 

In [None]:
%%local
portfolio[
    portfolio.token_symbol == "rETH"
].hvplot.scatter(
    x="block_time",
    y="token_amount",
    color="orange",
    title="rETH Buy/Sell Transactions",
)

In [None]:
%%local
r = portfolio[
    portfolio.token_symbol == "rETH"
]
r.hvplot.step(
    x="block_time",
    xlabel="Time",
    y="token_balance",
    ylabel="rETH",
    title="rETH Amount Held",
) * r.hvplot.scatter(
    x="block_time",
    y="token_balance",
    c="k",
    alpha=0.5,
)

## Portfolio Market Value
The questions we would like to answer next are:
- What is our token portfolio's **market value in ETH**
- What are the approximate **book and market values in USD**

For the last one we will start with an intermediate step (that will help us later) and create a derivative dataset with book values in USD for every portfolio transaction.

<div class="alert alert-block alert-success">

Add and pull the prepared dataset:
    
<p style="background:black">
<code style="background:black;color:white">kamu add datasets/account.tokens.portfolio.usd.yaml
kamu pull account.tokens.portfolio.usd
</code>
</p>
</div>

Here's how this dataset is defined:

```yaml
---
kind: DatasetSnapshot
version: 1
content:
  name: account.tokens.portfolio.usd
  kind: derivative
  metadata:
    - kind: setTransform
      inputs:
        - name: account.tokens.portfolio
        - name: com.cryptocompare.ohlcv.eth-usd
      transform:
        kind: sql
        engine: flink
        # Set up temporal table functions that turn our stream of exchange rates
        # into a 3-dimensional (rows + columns + time) lookup table
        temporalTables:
          - name: com.cryptocompare.ohlcv.eth-usd
            primaryKey:
              - from_symbol
        queries:
          - alias: with_usd_amount
            # Use Temporal Table JOIN to convert ETH to USD using exchange rate
            # at the time of each individual transaction
            query: |
              select
                tr.block_time,
                tr.block_number,
                tr.transaction_hash,
                tr.account_symbol,
                tr.token_symbol,
                tr.token_amount,
                tr.eth_amount,
                tr.token_balance,
                tr.token_book_value_eth,
                'usd' as account_anchor_symbol,
                (
                  tr.eth_amount * eth2usd.`close`
                ) as eth_amount_as_usd
              from `account.tokens.portfolio` as tr
              join `com.cryptocompare.ohlcv.eth-usd` for system_time as of tr.block_time as eth2usd
              on tr.account_symbol = eth2usd.from_symbol and eth2usd.to_symbol = 'usd'
          # Cumulative sum to derive the book value in USD
          - alias: account.tokens.portfolio.usd
            query: |
              select
                *,
                sum(-eth_amount_as_usd) over (partition by token_symbol order by block_time) as token_book_value_eth_as_usd
              from with_usd_amount
    - kind: setVocab
      eventTimeColumn: block_time
```

When converting between two currencies it's common for accountants to use **average exchange rates** for a month or even a whole year.

<mark>This sacrifices accuracy for the sake of simplicity</mark> and would work poorly for cryptocurrencies that still exhibit a lot of volatility. 

Can we get **both accuracy and simplicity**, so that for every single transaction we use the exchange rate as it was **at the time of that transaction**?

This is where a [Temporal-Table JOIN](https://docs.kamu.dev/cli/learn/examples/stock-trading/#calculating-current-market-value-of-held-positions) can help us. It transforms an exchange rate stream into a kind of lookup table that can be indexed by time to get the appropriate exchange rate.
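
The "lookup table indexed by time" can be approximated in batch form with a binary search over sorted rate timestamps. The sketch below is not how the streaming engine implements the JOIN. It only illustrates the "as of" semantics, with made-up rates and integer timestamps standing in for real event times.

```python
from bisect import bisect_right
from typing import Optional

# Made-up closing prices: (event_time, close in USD), sorted by time
rates = [(1, 3000.0), (5, 2500.0), (9, 2000.0)]
rate_times = [t for t, _ in rates]


def rate_as_of(ts: int) -> Optional[float]:
    """Return the latest rate whose event_time is at or before `ts`, if any."""
    i = bisect_right(rate_times, ts)
    if i == 0:
        return None  # no rate was known yet at that time
    return rates[i - 1][1]


# Made-up transactions: (block_time, eth_amount)
transactions = [(2, -1.0), (5, -2.0), (12, 0.5)]

# Convert every transaction using the rate valid at its own block_time
usd_amounts = [(t, eth * rate_as_of(t)) for t, eth in transactions]
```

Each transaction gets the rate that was in effect at its own timestamp, rather than a monthly or yearly average.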

Now we have a very rich dataset containing detailed information per every portfolio transaction, and also the **cumulative balance** of every position in the portfolio, i.e. the "portfolio state".

The **market value** is basically how much money we would get at different points in time if we decided to liquidate our entire portfolio. To produce it we will use the exact same type of JOIN as before, but instead of joining exchange rates onto transactions we flip the direction and join the state of our portfolio onto every exchange rate data point.
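
Flipping the direction means that the exchange rate data points drive the output and the portfolio state is what gets looked up. A minimal batch sketch of that idea, with made-up balances, made-up rETH to ETH rates, and integer timestamps, could look like this:

```python
from bisect import bisect_right
from typing import Optional

# Made-up portfolio state changes: (block_time, token_balance in rETH)
balances = [(2, 2.0), (6, 3.0)]
balance_times = [t for t, _ in balances]

# Made-up exchange rate data points: (event_time, ETH per 1 rETH)
ticks = [(1, 1.0), (3, 1.25), (7, 1.5)]


def balance_as_of(ts: int) -> Optional[float]:
    """Return the most recent balance known at time `ts`, if any."""
    i = bisect_right(balance_times, ts)
    return balances[i - 1][1] if i else None


# One output row per rate data point that has a known portfolio state
market_value_eth = [
    (t, rate * balance_as_of(t))
    for t, rate in ticks
    if balance_as_of(t) is not None
]
```

The first rate data point produces no output because the portfolio had no state yet at that time, which mirrors the behaviour of an inner join.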

<div class="alert alert-block alert-success">

Add and pull the prepared dataset:

<p style="background:black">
<code style="background:black;color:white">kamu add datasets/account.tokens.portfolio.market-value.yaml
kamu pull account.tokens.portfolio.market-value
</code>
</p>
</div>

Here's how the market value dataset is defined:


```yaml
---
kind: DatasetSnapshot
version: 1
content:
  name: account.tokens.portfolio.market-value
  kind: derivative
  metadata:
    - kind: setTransform
      inputs:
        - name: account.tokens.portfolio.usd
        - name: net.rocketpool.reth.mint-burn
        - name: com.cryptocompare.ohlcv.eth-usd
      transform:
        kind: sql
        engine: flink
        temporalTables:
          - name: account.tokens.portfolio.usd
            primaryKey:
              - token_symbol
          - name: com.cryptocompare.ohlcv.eth-usd
            primaryKey:
              - from_symbol
        queries:
          # TODO: generate daily ticks?
          - alias: market_value_reth2eth
            query: |
              select
                rp.event_time,
                tr.account_symbol,
                tr.token_symbol,
                tr.token_balance,
                tr.token_book_value_eth,
                (
                  rp.eth_amount / rp.amount * tr.token_balance
                ) as token_market_value_eth,
                tr.token_book_value_eth_as_usd
              from `net.rocketpool.reth.mint-burn` as rp
              join `account.tokens.portfolio.usd` for system_time as of rp.event_time as tr
              on rp.token_symbol = tr.token_symbol
          - alias: account.tokens.portfolio.market-value
            query: |
              select
                rp.event_time,
                rp.account_symbol,
                rp.token_symbol,
                rp.token_balance,
                rp.token_book_value_eth,
                rp.token_market_value_eth,
                rp.token_book_value_eth_as_usd,
                (
                  rp.token_market_value_eth * eth2usd.`close`
                ) as token_market_value_usd
              from market_value_reth2eth as rp
              join `com.cryptocompare.ohlcv.eth-usd` for system_time as of rp.event_time as eth2usd
              on eth2usd.from_symbol = rp.account_symbol and eth2usd.to_symbol = 'usd'
```

In [None]:
%import_dataset account.tokens.portfolio.market-value

In [None]:
%%sql -o market_value -q
select * from `account.tokens.portfolio.market-value` 

In [None]:
%%local
market_value.hvplot.line(
    x="event_time", 
    y=["token_book_value_eth", "token_market_value_eth"],
    xlabel="Time",
    ylabel="ETH",
    legend="bottom_right",
    title="rETH: Book vs Market Value in ETH",
    height=500,
    width=800,
)

In [None]:
%%local
market_value.hvplot.line(
    x="event_time",
    y=["token_book_value_eth_as_usd", "token_market_value_usd"],
    xlabel="Time",
    ylabel="USD",
    legend="bottom_right",
    title="rETH: Book vs Market Value in USD",
    height=500,
    width=800,
)

---

## Summary

Phew... We've covered a lot of steps!

To recap, here's the **outline of the pipeline** we just created:

![blah](files/pipeline-1.png)

```
┌───────────────────────────────────────┬──────────────┬─────────────────────────────────────────────────┐
│                 Name                  │     Kind     │                  Description                    │
├───────────────────────────────────────┼──────────────┼─────────────────────────────────────────────────┤
│ net.rocketpool.reth.mint-burn         │ Remote(Root) │ rETH transactions (pulled from IPFS)            │
│ com.cryptocompare.ohlcv.eth-usd       │ Remote(Root) │ ETH to USD exchange rate (pulled from IPFS)     │
│ account.tokens.transfers              │     Root     │ Wallet token transfers (sourced from Etherscan) │
│ account.transactions                  │     Root     │ Wallet transactions (sourced from Etherscan)    │
│ account.tokens.portfolio              │  Derivative  │ Tokens portfolio                                │
│ account.tokens.portfolio.usd          │  Derivative  │ Tokens portfolio with USD book prices           │
│ account.tokens.portfolio.market-value │  Derivative  │ Tokens portfolio market value in ETH and USD    │
└───────────────────────────────────────┴──────────────┴─────────────────────────────────────────────────┘
```

Putting this pipeline together surely takes time, but things get much faster as you gain experience with different types of streaming JOINs, and faster still if you collaborate and reuse pipelines made by others.

The good thing is, whether you're ingesting external data or building processing pipelines with `kamu`, **you only have to do it once**. While data is flowing, your queries will continue to produce **up-to-date results with minimal maintenance effort**.

We will cover some advanced aspects of why streaming pipelines are much more autonomous than batch in the next chapter, so please follow along!