<a id="introduction"></a>
## RAPIDS Foundations
#### By Paul Hendricks
-------

In this notebook, we show how to get started with GPU DataFrames using cuDF and Dask cuDF in RAPIDS.

**Table of Contents**

* [RAPIDS Foundations](#introduction)
* [Setup](#setup)
* [cuDF DataFrames](#cudfdataframes)
* [Working with cuDF DataFrames using Dask](#working)
* [Dask cuDF DataFrames](#daskcudffundamentals)
* [Conclusion](#conclusion)

<a id="setup"></a>
## Setup

This notebook was tested using the following Docker containers:

* `rapidsai/rapidsai:0.6-cuda10.0-devel-ubuntu18.04-gcc7-py3.7` from [DockerHub](https://hub.docker.com/r/rapidsai/rapidsai)
* `rapidsai/rapidsai-nightly:0.6-cuda10.0-devel-ubuntu18.04-gcc7-py3.7` from [DockerHub](https://hub.docker.com/r/rapidsai/rapidsai-nightly)

This notebook was run on an NVIDIA Tesla V100 GPU. Your system may differ, so you may need to modify the code or install packages to run the examples below.

If you think you have found a bug or an error, please file an issue here: https://github.com/rapidsai/notebooks/issues

Before we begin, let's check out our hardware setup by running the `nvidia-smi` command.

In [None]:
!nvidia-smi

Next, let's see what CUDA version we have:

In [None]:
!nvcc --version
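
The two shell commands above assume that `nvidia-smi` and `nvcc` are on the `PATH`. As a minimal sketch using only the Python standard library, the check below reports which of the two tools are visible before you call them. The helper name `find_tools` is hypothetical and is not part of RAPIDS.

```python
import shutil


# `find_tools` is a hypothetical helper for this illustration, not a RAPIDS API.
def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


tools = find_tools(["nvidia-smi", "nvcc"])
for name, path in tools.items():
    status = path if path is not None else "not found on PATH"
    print(f"{name}: {status}")
```

If either tool is reported as missing, the corresponding `!` cell above will fail with a "command not found" message rather than a GPU error.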

<a id="cudfdataframes"></a>
## cuDF DataFrames

A cuDF DataFrame is a pandas-like, column-oriented table that is stored in GPU memory.

#### Creating a cudf.DataFrame using lists

We start with an empty `cudf.DataFrame` and then add columns from Python lists.

In [None]:
import cudf

df = cudf.DataFrame()
print(df)

In [None]:
# here we create two columns named "key" and "value"
df['key'] = [0, 1, 2, 3, 4]
df['value'] = [float(i + 10) for i in range(5)]
print(df)

In [None]:
import dask_cudf

# split the cuDF DataFrame into Dask partitions
ddf = dask_cudf.from_cudf(df, npartitions=4)

In [None]:
print(ddf)

In [None]:
print(ddf.head())

In [None]:
type(ddf)

#### Creating a cudf.DataFrame using a list of tuples or a dictionary

A `cudf.DataFrame` can also be built from a list of `(column name, values)` tuples or from a dictionary that maps column names to values. Here the values are NumPy arrays.

In [None]:
import numpy as np
from datetime import datetime, timedelta


ids = np.arange(5)
t0 = datetime.strptime('2018-10-07 12:00:00', '%Y-%m-%d %H:%M:%S')
datetimes = [t0 + timedelta(seconds=x) for x in range(5)]
dts = np.array(datetimes, dtype='datetime64')

In [None]:
df = cudf.DataFrame([('id', ids), ('datetimes', dts)])
print(df)

In [None]:
df = cudf.DataFrame({'id': ids, 'datetimes': dts})
print(df)

#### Loading from a Pandas DataFrame

`cudf.from_pandas` converts an existing pandas DataFrame into a cuDF DataFrame.

In [None]:
import pandas as pd

In [None]:
pdf = pd.DataFrame({'a': [0, 1, 2, 3], 'b': [0.1, 0.2, None, 0.3]})

In [None]:
df = cudf.from_pandas(pdf)
print(df)

#### Loading a CSV file

`cudf.read_csv` reads a CSV file into a cuDF DataFrame.

In [None]:
df = cudf.read_csv('../datasets/iris.csv')
print(df)

#### Loading a Parquet file

In [None]:
# df = cudf.read_csv('../datasets/iris.csv')
print(df)

#### Loading an ORC file

In [None]:
# df = cudf.read_csv('../datasets/iris.csv')
print(df)

#### Loading from S3

In [None]:
# df = cudf.read_csv('../datasets/iris.csv')
print(df)

#### Inspecting a cuDF DataFrame

The attributes below mirror their pandas counterparts.

#### Columns

`df.columns` returns the column names of the DataFrame.

#### Data types

`df.dtypes` returns the data type of each column.

#### Series

Selecting a single column by name with `df[column_name]` returns a `cudf.Series`.

#### Index

`df.index` returns the row labels of the DataFrame.

#### Writing to a CSV file

In [None]:
# df = cudf.read_csv('../datasets/iris.csv')
print(df)

#### Writing to a Parquet file

In [None]:
# df = cudf.read_csv('../datasets/iris.csv')
print(df)

#### Writing to an ORC file

In [None]:
# df = cudf.read_csv('../datasets/iris.csv')
print(df)

<a id="working"></a>
## Working with cuDF DataFrames using Dask

Before we start working with cuDF DataFrames using Dask, we need to set up a local CUDA cluster and a client to work with our GPUs.

In [1]:
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
import subprocess

# parse the hostname IP address
cmd = "hostname --all-ip-addresses"
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()
ip_address = str(output.decode()).split()[0]

# create a local CUDA cluster
cluster = LocalCUDACluster(ip=ip_address)
client = Client(cluster)
client

Client  Scheduler: tcp://192.168.99.2:33298  Dashboard: http://192.168.99.2:8787/status
Cluster  Workers: 8  Cores: 8  Memory: 540.95 GB
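
The `hostname --all-ip-addresses` call used above is specific to Linux. As a hedged alternative, the sketch below looks up an address with the standard-library `socket` module and falls back to the loopback address when the lookup fails. The helper name `get_ip_address` is hypothetical, and the address it returns may differ from the first entry printed by `hostname`.

```python
import socket


# `get_ip_address` is a hypothetical helper for this illustration.
def get_ip_address(default="127.0.0.1"):
    """Return an IPv4 address for this host, or `default` if the lookup fails."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        # socket.gaierror is a subclass of OSError, so failed lookups land here
        return default


ip_address = get_ip_address()
print(ip_address)
```

The result could then be passed to `LocalCUDACluster(ip=ip_address)` in the same way as in the cell above.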


<a id="daskcudffundamentals"></a>
## Dask cuDF DataFrames

#### Reading data

We first describe the schema of the mortgage performance files as an ordered mapping from column name to dtype, and then pass that schema to `dask_cudf.read_csv`.

In [4]:
from collections import OrderedDict
import os

base_path = os.path.join('/', 'datasets', 'rapids', 'mortgage', 'mortgage_2000_1gb')

dtypes = OrderedDict([
        ('loan_id', 'int64'),
        ('monthly_reporting_period', 'date'),
        ('servicer', 'category'),
        ('interest_rate', 'float64'),
        ('current_actual_upb', 'float64'),
        ('loan_age', 'float64'),
        ('remaining_months_to_legal_maturity', 'float64'),
        ('adj_remaining_months_to_maturity', 'float64'),
        ('maturity_date', 'date'),
        ('msa', 'float64'),
        ('current_loan_delinquency_status', 'int32'),
        ('mod_flag', 'category'),
        ('zero_balance_code', 'category'),
        ('zero_balance_effective_date', 'date'),
        ('last_paid_installment_date', 'date'),
        ('foreclosed_after', 'date'),
        ('disposition_date', 'date'),
        ('foreclosure_costs', 'float64'),
        ('prop_preservation_and_repair_costs', 'float64'),
        ('asset_recovery_costs', 'float64'),
        ('misc_holding_expenses', 'float64'),
        ('holding_taxes', 'float64'),
        ('net_sale_proceeds', 'float64'),
        ('credit_enhancement_proceeds', 'float64'),
        ('repurchase_make_whole_proceeds', 'float64'),
        ('other_foreclosure_proceeds', 'float64'),
        ('non_interest_bearing_upb', 'float64'),
        ('principal_forgiveness_upb', 'float64'),
        ('repurchase_make_whole_proceeds_flag', 'category'),
        ('foreclosure_principal_write_off_amount', 'float64'),
        ('servicing_activity_indicator', 'category')
    ])
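
The schema is stored in an `OrderedDict` so that column names and dtypes stay in file order when they are split into two parallel lists for `read_csv`. The sketch below illustrates this with a shortened, three-column copy of the mapping. The variable names `sample_dtypes`, `names`, and `types` are introduced only for this illustration.

```python
from collections import OrderedDict

# A shortened copy of the schema above, for illustration only.
sample_dtypes = OrderedDict([
    ('loan_id', 'int64'),
    ('monthly_reporting_period', 'date'),
    ('servicer', 'category'),
])

names = list(sample_dtypes.keys())
types = list(sample_dtypes.values())

# keys() and values() iterate in the same insertion order,
# so position i in `names` matches position i in `types`.
for name, dtype in zip(names, types):
    print(name, dtype)
```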

In [5]:
import cudf; print('cuDF Version:', cudf.__version__)
import dask_cudf; print('Dask cuDF Version:', dask_cudf.__version__)
import numpy as np; print('NumPy Version:', np.__version__)


filepath = os.path.join(base_path, 'perf', 'Performance_*')
# filepath = os.path.join(base_path, 'perf', 'Performance_2000Q1.txt_0')
df = dask_cudf.read_csv(filepath, delimiter='|', 
                        names=list(dtypes.keys()), dtype=list(dtypes.values()))

cuDF Version: 0.6.1+1.g9ca93255
Dask cuDF Version: 0.6.1+1.g9ca93255
NumPy Version: 1.15.4
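
The `Performance_*` suffix in `filepath` is a shell-style wildcard, so a single path can cover every file whose name starts with `Performance_`. As a small standard-library sketch of how such a pattern matches, the example below uses `fnmatch`. Apart from `Performance_2000Q1.txt_0`, which appears in the commented-out line above, the file names are made up for illustration.

```python
from fnmatch import fnmatch

pattern = 'Performance_*'

# Hypothetical file names, apart from the first one.
candidates = [
    'Performance_2000Q1.txt_0',
    'Performance_2000Q2.txt',
    'Acquisition_2000Q1.txt',
]

# Keep only the names that match the wildcard pattern.
matches = [name for name in candidates if fnmatch(name, pattern)]
print(matches)
```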


#### Inspecting a Dask cuDF DataFrame

Displaying a Dask cuDF DataFrame shows its structure (column names, dtypes, and the number of partitions) rather than its data, because evaluation is lazy.

In [6]:
df

,loan_id,monthly_reporting_period,servicer,interest_rate,current_actual_upb,loan_age,remaining_months_to_legal_maturity,adj_remaining_months_to_maturity,maturity_date,msa,current_loan_delinquency_status,mod_flag,zero_balance_code,zero_balance_effective_date,last_paid_installment_date,foreclosed_after,disposition_date,foreclosure_costs,prop_preservation_and_repair_costs,asset_recovery_costs,misc_holding_expenses,holding_taxes,net_sale_proceeds,credit_enhancement_proceeds,repurchase_make_whole_proceeds,other_foreclosure_proceeds,non_interest_bearing_upb,principal_forgiveness_upb,repurchase_make_whole_proceeds_flag,foreclosure_principal_write_off_amount,servicing_activity_indicator
npartitions=17,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,int64,datetime64[ms],int32,float64,float64,float64,float64,float64,datetime64[ms],float64,int32,int32,int32,datetime64[ms],datetime64[ms],datetime64[ms],datetime64[ms],float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,float64,int32,float64,int32
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...


#### Ownership of objects

Each partition of a Dask cuDF DataFrame can be converted into a `Delayed` object with `to_delayed()`. This lets us run a function on every partition and then check which worker holds each result.

In [None]:
df_delayed = df.to_delayed()
df_delayed

[Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 0)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 1)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 2)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 3)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 4)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 5)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 6)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 7)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 8)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 9)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 10)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 11)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 12)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 13)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 14)),
 Delayed(('read-csv-9f3774f9b87bf7e223821665363c3084', 15)),
 Delayed(('read-csv-9f3774f9b87bf7

Next, we wrap a small `head` function with `dask.delayed`, apply it to every partition, and compute the results on the cluster.

In [56]:
from dask.delayed import delayed


def head(dataframe):
    return dataframe.head()


dfs = [delayed(head)(d) for d in df_delayed]

In [58]:
from dask.distributed import wait

futures = client.compute(dfs)
wait(futures)
futures

[<Future: status: finished, type: DataFrame, key: head-e907ff47-c269-405a-8da5-2c97981dd222>,
 <Future: status: finished, type: DataFrame, key: head-22125ad3-6766-4906-a5b8-ad42de5c4590>,
 <Future: status: finished, type: DataFrame, key: head-95c33207-ec65-4a97-871e-383bd494bc99>,
 <Future: status: finished, type: DataFrame, key: head-02c9a5e7-ea67-4fdf-a771-1728c40e04e6>,
 <Future: status: finished, type: DataFrame, key: head-9c0cc252-3025-43bb-8051-a2c02825f995>,
 <Future: status: finished, type: DataFrame, key: head-6b395005-35b9-44d2-a8a5-3667834f90d7>,
 <Future: status: finished, type: DataFrame, key: head-969d221e-b689-4a27-b098-f91976efa5df>,
 <Future: status: finished, type: DataFrame, key: head-76ee5341-df55-47aa-ad82-ff53296be259>,
 <Future: status: finished, type: DataFrame, key: head-7ae76284-84b7-43d8-b3cd-9cb3e0f4c523>,
 <Future: status: finished, type: DataFrame, key: head-8727f33a-1753-455b-94d5-79637a238832>,
 <Future: status: finished, type: DataFrame, key: head-81c42

In [59]:
# equivalent: results = [future.result() for future in futures]
results = client.gather(futures)
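
The submit, wait, and gather steps above follow the general futures pattern. As a rough analogy only, the sketch below reproduces that pattern with the standard-library `concurrent.futures` module on plain Python lists. It runs in local threads and uses no Dask or GPU code; the `head` helper, the `partitions` data, and the `local_*` names are stand-ins chosen for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def head(rows, n=5):
    """Return the first n items, mimicking DataFrame.head()."""
    return rows[:n]


# Stand-in "partitions": three small lists instead of GPU DataFrames.
partitions = [list(range(start, start + 10)) for start in (0, 100, 200)]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit one task per partition, then block until all have finished.
    local_futures = [pool.submit(head, part) for part in partitions]
    wait(local_futures)
    # Collect the results in submission order.
    local_results = [future.result() for future in local_futures]

print(local_results)
```

Unlike this thread pool, the Dask client keeps each result on the worker that produced it until it is gathered, which is why the next cells can ask which worker holds which result.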

In [52]:
# map each future to the address of the worker(s) holding its result
result_worker_map = {future: list(client.who_has([future]).values())[0] for future in futures}
import time; time.sleep(3)

In [53]:
result_worker_map

{<Future: status: finished, type: DataFrame, key: head-e12cff38-0859-425d-a231-97a0dea7c54d>: ('tcp://192.168.99.2:41185',),
 <Future: status: finished, type: DataFrame, key: head-e45a817a-2981-4540-a8a3-9b82f56944b1>: ('tcp://192.168.99.2:41185',),
 <Future: status: finished, type: DataFrame, key: head-5dd1cb43-a4a6-4af3-94b9-58dcb6afa372>: ('tcp://192.168.99.2:36531',),
 <Future: status: finished, type: DataFrame, key: head-46f8773c-a7df-4ec1-82e5-6a5218f1bf8d>: ('tcp://192.168.99.2:36531',),
 <Future: status: finished, type: DataFrame, key: head-1fcbb1ad-1e4d-4934-ad35-75089cd81171>: ('tcp://192.168.99.2:33237',),
 <Future: status: finished, type: DataFrame, key: head-ac27a29f-6016-481b-91bc-93eac5432f45>: ('tcp://192.168.99.2:42212',),
 <Future: status: finished, type: DataFrame, key: head-e43d1304-fcee-45f4-a26e-546fa306717c>: ('tcp://192.168.99.2:45330',),
 <Future: status: finished, type: DataFrame, key: head-75d2cc7f-c85d-4a90-8d9a-82b5f1bcba11>: ('tcp://192.168.99.2:40690',),


In [55]:
print(results[0])

        loan_id  monthly_reporting_period  servicer  interest_rate  current_actual_upb  loan_age  remaining_months_to_legal_maturity ...  servicing_activity_indicator
0  100007365142   2000-01-01T00:00:00.000                      8.0                           0.0                               360.0 ...                              
1  100007365142   2001-01-01T00:00:00.000                      8.0             74319.0      12.0                               348.0 ...                              
2  100007365142   2002-01-01T00:00:00.000                      8.0            73635.48      24.0                               336.0 ...                              
3  100007365142   2003-01-01T00:00:00.000                      8.0   72795.40999999999      36.0                               324.0 ...                              
4  100007365142   2000-02-01T00:00:00.000                      8.0                           1.0                               359.0 ...                             

In [None]:
remote_df = client.scatter(df)

In [None]:
# `my_function` stands for any user-defined function that accepts the scattered DataFrame
future = client.submit(my_function, remote_df)

#### Writing data

<a id="conclusion"></a>
## Conclusion

To learn more about RAPIDS, be sure to check out: 

* [Open Source Website](http://rapids.ai)
* [GitHub](https://github.com/rapidsai/)
* [Press Release](https://nvidianews.nvidia.com/news/nvidia-introduces-rapids-open-source-gpu-acceleration-platform-for-large-scale-data-analytics-and-machine-learning)
* [NVIDIA Blog](https://blogs.nvidia.com/blog/2018/10/10/rapids-data-science-open-source-community/)
* [Developer Blog](https://devblogs.nvidia.com/gpu-accelerated-analytics-rapids/)
* [NVIDIA Data Science Webpage](https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/)