# Myria Python & Jupyter

<img src="https://raw.githubusercontent.com/uwescience/myria-python/master/ipnb%20examples/overview.png" style="height: 300px"/>

### To install `Myria-Python`:

```
pip install myria-python
```

### Or, to install from source:

```
git clone https://github.com/uwescience/myria-python
cd myria-python
sudo python setup.py install
```



## 1. Connecting to Myria

In [2]:
from myria import *
import numpy

# Load Myria extensions
%load_ext myria

# Create Jupyter Connection
%connect http://demo.myria.cs.washington.edu:8753

<myria.connection.MyriaConnection at 0x7f07d00a90d0>

In [3]:
# Alternatively, create an ordinary Python connection to the Myria demo cluster
connection = MyriaConnection(rest_url='http://demo.myria.cs.washington.edu:8753')
# Use this as the default connection
MyriaRelation.DefaultConnection = connection

## 2. Myria: Connections, Relations, and Queries (and Schemas and Plans)

In [4]:
# How many datasets are there on the server?
print(len(connection.datasets()))

11


In [5]:
# Let's look at the first dataset...
dataset = connection.datasets()[0]
print(dataset['relationKey']['relationName'])
print(dataset['created'])

InDegree
2016-07-04T09:30:18.897Z


In [7]:
# View data stored in this relation
MyriaRelation(dataset['relationKey'])

Unnamed: 0,a,sum_count_all
0,648,10
1,38,13
2,300,1
3,989,29
4,633,34
5,339,4
6,272,4
7,591,1
8,238,8
9,16,45


### Uploading data

In [19]:
%%query

-- Load from S3
florida = load("https://s3-us-west-2.amazonaws.com/myria-demo-data/fl_insurance_sample_2.csv",
csv(schema(
            id:int,
            geo:string,
            granularity:int,
            deductable:float,
            policyID:int, 
            construction:string,
            line:string,
            county:string,
            state:string,
            longitude:float,
            latitude:float,
            fl_site_deductible:float,
            hu_site_deductible:float,
            eq_site_deductible:float,
            tiv_2012:float,
            tiv_2011:float,
            fr_site_limit:float,
            fl_site_limit:float,
            hu_site_limit:float,
            eq_site_limit:float), skip=1));


clay_county = [from florida where county = 'CLAY COUNTY' emit *];

store(clay_county, insurance);

Unnamed: 0,construction,county,deductable,eq_site_deductible,eq_site_limit,fl_site_deductible,fl_site_limit,fr_site_limit,geo,granularity,hu_site_deductible,hu_site_limit,id,latitude,line,longitude,policyID,state,tiv_2011,tiv_2012
0,Wood,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000CFF753E3A57F54C00E7F4DD6A8CF...,1,0.0,63259.77,10388,29.81117,Residential,-81.9945,767514,FL,63259.77,50163.99
1,Masonry,CLAY COUNTY,0,0.0,498960.0,0,498960.0,498960.0,0101000020E6100000E0421EC18D6D54C000A8E2C62D1A...,1,9979.2,498960.0,1,30.102261,Residential,-81.711777,119736,FL,498960.0,792148.9
2,Masonry,CLAY COUNTY,0,0.0,1322376.3,0,1322376.3,1322376.3,0101000020E61000009D23F25D4A6D54C07D09151C5E10...,3,0.0,1322376.3,2,30.063936,Residential,-81.707664,448094,FL,1322376.3,1438163.57
3,Wood,CLAY COUNTY,0,0.0,190724.4,0,190724.4,190724.4,0101000020E610000076543541D46C54C08C683BA6EE16...,1,0.0,190724.4,3,30.089579,Residential,-81.700455,206893,FL,190724.4,192476.78
4,Wood,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000FC1186014B6D54C00AEE073C3010...,3,0.0,79520.76,4,30.063236,Residential,-81.707703,333743,FL,79520.76,86854.48
5,Wood,CLAY COUNTY,0,0.0,0.0,0,0.0,254281.5,0101000020E6100000DF2D90A0F86C54C003AE2B66840F...,1,0.0,254281.5,5,30.060614,Residential,-81.702675,172534,FL,254281.5,246144.49
6,Masonry,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000FC1186014B6D54C00AEE073C3010...,3,0.0,515035.62,6,30.063236,Residential,-81.707703,785275,FL,515035.62,884419.17
7,Reinforced Concrete,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000FFCC203EB06D54C008CDAE7B2B1A...,1,0.0,19260000.0,7,30.102226,Commercial,-81.713882,995932,FL,19260000.0,20610000.0
8,Wood,CLAY COUNTY,0,0.0,328500.0,0,328500.0,328500.0,0101000020E6100000D1DF4BE1416D54C06018B0E42A1A...,1,16425.0,328500.0,8,30.102217,Residential,-81.707146,223488,FL,328500.0,348374.25
9,Wood,CLAY COUNTY,0,0.0,315000.0,0,315000.0,315000.0,0101000020E610000088D51F61186D54C0769D0DF9671E...,1,15750.0,315000.0,9,30.118774,Residential,-81.704613,433512,FL,315000.0,265821.57
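
The `schema(...)` clause in the query above lists twenty `name:type` pairs by hand. When the column list already exists in Python, the clause text could be generated instead. The sketch below is an illustration only: `myrial_schema` is a hypothetical helper, not part of `myria-python`, and it does nothing more than build a string from the three type names that appear in the query above.

```python
# Hypothetical helper (not part of myria-python): render a MyriaL
# schema(...) clause from (column name, MyriaL type) pairs.
KNOWN_TYPES = {"int", "string", "float"}  # only the types used in the query above


def myrial_schema(columns):
    """Return 'schema(name:type, ...)' for a list of (name, type) pairs."""
    parts = []
    for name, type_name in columns:
        if type_name not in KNOWN_TYPES:
            raise ValueError("unexpected type %r for column %r" % (type_name, name))
        parts.append("%s:%s" % (name, type_name))
    return "schema(%s)" % ", ".join(parts)


# First four columns of the insurance schema, as declared in the query.
columns = [("id", "int"), ("geo", "string"), ("granularity", "int"), ("deductable", "float")]
print(myrial_schema(columns))
```

The resulting string can be pasted into a `csv(...)` clause; the type check is there to catch typos before the query reaches the server.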


In [13]:
# Alternatively, you can upload directly from a Python string
name = {'userName': 'Brandon', 'programName': 'Demo', 'relationName': 'Books'}
schema = { "columnNames" : ["name", "pages"],
           "columnTypes" : ["STRING_TYPE","LONG_TYPE"] }

data = """Brave New World,288
Nineteen Eighty-Four,376
We,256"""

result = connection.upload_file(
    name, schema, data, delimiter=',', overwrite=True)

MyriaRelation(result['relationKey'], connection=connection)

Unnamed: 0,name,pages
0,Nineteen Eighty-Four,376
1,Brave New World,288
2,We,256
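
Before uploading a string, it can help to confirm locally that every row matches the schema. The following is a minimal sketch using only the Python 3 standard library; `check_rows` and `CONVERTERS` are hypothetical names, and treating `LONG_TYPE` as a Python `int` is an assumption made for this check, not something the Myria API requires.

```python
import csv
import io

# Converters for the two column types used in the schema above.
# Mapping LONG_TYPE to int is an assumption made for this local check.
CONVERTERS = {"STRING_TYPE": str, "LONG_TYPE": int}


def check_rows(data, schema, delimiter=","):
    """Parse delimited text and convert each field according to the schema.

    Raises ValueError if a row has the wrong number of fields or a
    field cannot be converted to its declared type.
    """
    names = schema["columnNames"]
    types = schema["columnTypes"]
    rows = []
    reader = csv.reader(io.StringIO(data), delimiter=delimiter)
    for line_number, fields in enumerate(reader, start=1):
        if len(fields) != len(names):
            raise ValueError("line %d: expected %d fields, got %d"
                             % (line_number, len(names), len(fields)))
        rows.append(tuple(CONVERTERS[t](value) for t, value in zip(types, fields)))
    return rows


schema = {"columnNames": ["name", "pages"],
          "columnTypes": ["STRING_TYPE", "LONG_TYPE"]}
data = """Brave New World,288
Nineteen Eighty-Four,376
We,256"""

rows = check_rows(data, schema)
print(len(rows))
```

Running the check first means a malformed line is reported with its line number locally rather than surfacing as a failed upload.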


In [14]:
# Or, load using the myria_upload command-line utility
!wget https://s3-us-west-2.amazonaws.com/myria-demo-data/books.csv
!myria_upload --hostname demo.myria.cs.washington.edu --port 8753 --no-ssl --user Brandon --program Demo --relation Demo --overwrite books.csv

--2016-07-04 10:23:27--  https://s3-us-west-2.amazonaws.com/myria-demo-data/books.csv
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 54.231.184.18
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|54.231.184.18|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 70 [application/octet-stream]
Saving to: ‘books.csv.1’


2016-07-04 10:23:27 (676 KB/s) - ‘books.csv.1’ saved [70/70]

INFO:root:RelationKey = Brandon:Demo:Demo
INFO:root:Schema = [('STRING_TYPE', 'column0'), ('LONG_TYPE', 'column1')]
INFO:root:Myria schema: {"columnNames": ["column0", "column1"], "columnTypes": ["STRING_TYPE", "LONG_TYPE"]}
INFO:root:Creating a plaintext file
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): demo.myria.cs.washington.edu
{
    "created": "2016-07-04T17:23:27.738Z",
    "numTuples": 4,
    "uri": "http://demo.myria.cs.washington.edu:8753/dataset/user-Brandon/program-Demo/relation-Demo",
    "howPartitio

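The `myria_upload` invocation above repeats information that is already held in the REST URL and the relation key. As a rough sketch (Python 3 standard library), the argument list could be assembled from those two values. `upload_args` is a hypothetical helper; it uses only the flags that appear in the command above, and omitting `--no-ssl` for `https` URLs is an assumption.

```python
from urllib.parse import urlparse


def upload_args(rest_url, relation_key, filename, overwrite=False):
    """Build a myria_upload argument list using only the flags shown above."""
    url = urlparse(rest_url)
    if url.hostname is None or url.port is None:
        raise ValueError("REST URL must include a host and a port: %r" % rest_url)
    args = ["myria_upload", "--hostname", url.hostname, "--port", str(url.port)]
    if url.scheme == "http":
        # Assumption: plain-HTTP endpoints need --no-ssl, as in the command above.
        args.append("--no-ssl")
    args += ["--user", relation_key["userName"],
             "--program", relation_key["programName"],
             "--relation", relation_key["relationName"]]
    if overwrite:
        args.append("--overwrite")
    args.append(filename)
    return args


key = {"userName": "Brandon", "programName": "Demo", "relationName": "Demo"}
args = upload_args("http://demo.myria.cs.washington.edu:8753", key, "books.csv", overwrite=True)
print(" ".join(args))
```

The list form can be passed directly to `subprocess.call` outside a notebook, which avoids shell quoting problems with relation names.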
### Working with relations

In [20]:
# Using the previously-stored insurance relation
MyriaRelation("insurance")

Unnamed: 0,construction,county,deductable,eq_site_deductible,eq_site_limit,fl_site_deductible,fl_site_limit,fr_site_limit,geo,granularity,hu_site_deductible,hu_site_limit,id,latitude,line,longitude,policyID,state,tiv_2011,tiv_2012
0,Wood,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000CFF753E3A57F54C00E7F4DD6A8CF...,1,0.0,63259.77,10388,29.81117,Residential,-81.9945,767514,FL,63259.77,50163.99
1,Masonry,CLAY COUNTY,0,0.0,498960.0,0,498960.0,498960.0,0101000020E6100000E0421EC18D6D54C000A8E2C62D1A...,1,9979.2,498960.0,1,30.102261,Residential,-81.711777,119736,FL,498960.0,792148.9
2,Masonry,CLAY COUNTY,0,0.0,1322376.3,0,1322376.3,1322376.3,0101000020E61000009D23F25D4A6D54C07D09151C5E10...,3,0.0,1322376.3,2,30.063936,Residential,-81.707664,448094,FL,1322376.3,1438163.57
3,Wood,CLAY COUNTY,0,0.0,190724.4,0,190724.4,190724.4,0101000020E610000076543541D46C54C08C683BA6EE16...,1,0.0,190724.4,3,30.089579,Residential,-81.700455,206893,FL,190724.4,192476.78
4,Wood,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000FC1186014B6D54C00AEE073C3010...,3,0.0,79520.76,4,30.063236,Residential,-81.707703,333743,FL,79520.76,86854.48
5,Wood,CLAY COUNTY,0,0.0,0.0,0,0.0,254281.5,0101000020E6100000DF2D90A0F86C54C003AE2B66840F...,1,0.0,254281.5,5,30.060614,Residential,-81.702675,172534,FL,254281.5,246144.49
6,Masonry,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000FC1186014B6D54C00AEE073C3010...,3,0.0,515035.62,6,30.063236,Residential,-81.707703,785275,FL,515035.62,884419.17
7,Reinforced Concrete,CLAY COUNTY,0,0.0,0.0,0,0.0,0.0,0101000020E6100000FFCC203EB06D54C008CDAE7B2B1A...,1,0.0,19260000.0,7,30.102226,Commercial,-81.713882,995932,FL,19260000.0,20610000.0
8,Wood,CLAY COUNTY,0,0.0,328500.0,0,328500.0,328500.0,0101000020E6100000D1DF4BE1416D54C06018B0E42A1A...,1,16425.0,328500.0,8,30.102217,Residential,-81.707146,223488,FL,328500.0,348374.25
9,Wood,CLAY COUNTY,0,0.0,315000.0,0,315000.0,315000.0,0101000020E610000088D51F61186D54C0769D0DF9671E...,1,15750.0,315000.0,9,30.118774,Residential,-81.704613,433512,FL,315000.0,265821.57


In [21]:
# View details about this relation
relation = MyriaRelation("insurance")
print(len(relation))
print(relation.created_date)
print(relation.schema.names)

363
2016-07-04 17:24:56.065000+00:00
[u'id', u'geo', u'granularity', u'deductable', u'policyID', u'construction', u'line', u'county', u'state', u'longitude', u'latitude', u'fl_site_deductible', u'hu_site_deductible', u'eq_site_deductible', u'tiv_2012', u'tiv_2011', u'fr_site_limit', u'fl_site_limit', u'hu_site_limit', u'eq_site_limit']
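
The dataset listing reports `created` as an ISO 8601 string, while `relation.created_date` prints like a timezone-aware datetime. The sketch below shows one way to convert the string form into the datetime form with the Python 3 standard library. Whether `myria-python` performs the conversion this way is not shown in this notebook, and `parse_created` is a hypothetical name.

```python
from datetime import datetime, timezone


def parse_created(text):
    """Parse a 'YYYY-MM-DDTHH:MM:SS.fffZ' string into an aware UTC datetime."""
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    # The trailing 'Z' denotes UTC, so attach that timezone explicitly.
    return naive.replace(tzinfo=timezone.utc)


# The 'created' value printed earlier for the first dataset on the server.
created = parse_created("2016-07-04T09:30:18.897Z")
print(created)
```

Aware datetimes can be compared and sorted safely, which is useful when looking for the most recently stored relation.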


### Working Locally with Relations

In [22]:
# 1: Download as a Python dictionary
d = MyriaRelation("insurance").to_dict()
print('First entry returned: %s' % d[0]['county'])

First entry returned: CLAY COUNTY


In [23]:
# 2: Download as a Pandas DataFrame
df = MyriaRelation("insurance").to_dataframe()
print('%d entries with nonzero deductible' % len(df[df.eq_site_deductible > 0]))

4 entries with nonzero deductible


In [26]:
# 3: Download as a DataFrame and convert to a numpy array
array = MyriaRelation("insurance").to_dataframe().values
print('Mean site limit = %d' % array[:,4].mean())

Mean site limit = 47226
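
Once a relation has been downloaded, ordinary Python is enough for simple summaries. The sketch below works on the list-of-dictionaries shape that `to_dict()` appears to return (as `d[0]['county']` suggests). It uses the `id` and `hu_site_deductible` values copied from the ten rows displayed earlier, not the full 363-row relation, and `summarize` is a hypothetical helper.

```python
# (id, hu_site_deductible) pairs copied from the ten rows displayed earlier.
sample = [(10388, 0.0), (1, 9979.2), (2, 0.0), (3, 0.0), (4, 0.0),
          (5, 0.0), (6, 0.0), (7, 0.0), (8, 16425.0), (9, 15750.0)]
rows = [{"id": i, "hu_site_deductible": value} for i, value in sample]


def summarize(rows, column):
    """Return (count of values > 0, mean of all values) for one column."""
    values = [row[column] for row in rows]
    nonzero = sum(1 for value in values if value > 0)
    mean = sum(values) / len(values) if values else 0.0
    return nonzero, mean


nonzero, mean = summarize(rows, "hu_site_deductible")
print("%d nonzero, mean %.2f" % (nonzero, mean))
```

For anything larger than a quick check, the DataFrame route is likely the better fit, since it gives column access by name without hand-written loops.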


## Working with queries

In [45]:
%%query --Embed MyriaL in Jupyter notebook by using the "%%query" prefix 

insurance = scan(insurance);

descriptives = [from insurance emit min(eq_site_deductible) as min_deductible, 
                                    max(eq_site_deductible) as max_deductible, 
                                    avg(eq_site_deductible) as mean_deductible, 
                                    stdev(eq_site_deductible) as stdev_deductible];

store(descriptives, descriptives);

Unnamed: 0,max_deductible,mean_deductible,min_deductible,stdev_deductible
0,14112,89.045455,0,989.204846


In [47]:
# Grab the results of the most recent execution
query = _
or_this_works_too = _45

In [48]:
query

Unnamed: 0,max_deductible,mean_deductible,min_deductible,stdev_deductible
0,14112,89.045455,0,989.204846


### Single-line queries may be treated like Python expressions

In [55]:
query = %datalog Just500(column0, 500) :- TwitterK(column0, 500)%
print(query.status)
query

SUCCESS


Unnamed: 0,_COLUMN1_,column0
0,500,499
1,500,498
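
For readers new to Datalog, the rule above keeps the `TwitterK` tuples whose second column equals 500 and emits them as `Just500`. A rough Python analogue is a list comprehension with a filter. The sketch below uses a stand-in list: the first two tuples match the output above, and the third is made up so that the filter has something to reject.

```python
# Stand-in for the TwitterK relation as (column0, column1) tuples.
# The first two tuples match the output above; the third is made up.
twitter_k = [(499, 500), (498, 500), (7, 12)]

# Just500(column0, 500) :- TwitterK(column0, 500)
# The constant 500 in the body acts as an equality filter on column1.
just_500 = [(column0, 500) for (column0, column1) in twitter_k if column1 == 500]

print(just_500)
```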


## 5. Variable Binding

In [56]:
low, high, destination = 543, 550, 'BoundRelation'

The tokens `@low`, `@high`, and `@destination` are bound to their values:

In [57]:
%%query
T1 = scan(TwitterK);
T2 = [from T1 where $0 > @low and $0 < @high emit $1 as x];
store(T2, @destination);

Unnamed: 0,x
0,989
1,21
2,53
3,20
4,610
5,16
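
To make the binding step concrete, here is a small sketch of textual substitution that produces the query the server would see. It illustrates the idea only and is not the extension's actual implementation, which may treat values differently (for example, by quoting strings). `bind` is a hypothetical function.

```python
import re


def bind(query, namespace):
    """Replace each @name token with the value of `name` from namespace."""
    def substitute(match):
        name = match.group(1)
        if name not in namespace:
            raise KeyError("no Python variable named %r" % name)
        return str(namespace[name])

    # Match @ followed by a Python-style identifier; $0 and $1 are untouched.
    return re.sub(r"@([A-Za-z_]\w*)", substitute, query)


low, high, destination = 543, 550, 'BoundRelation'
template = ("T2 = [from T1 where $0 > @low and $0 < @high emit $1 as x];\n"
            "store(T2, @destination);")
bound = bind(template, {"low": low, "high": high, "destination": destination})
print(bound)
```

Raising on an unknown name, instead of leaving the token in place, surfaces a misspelled variable before the query is submitted.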


# Deploying Myria on an Amazon EC2 Cluster

## 1. Installing the Myria CLI

```
# From the command line, execute:
sudo pip install myria-cluster
```

## 2. Launching Clusters

In [64]:
!myria-cluster create my-cluster


Your new Myria cluster 'my-cluster' has been launched on Amazon EC2 in the 'us-west-2' region.

View the Myria worker IDs and public hostnames of all nodes in this cluster (the coordinator has worker ID 0):
myria-cluster list my-cluster --region us-west-2

Stop this cluster:
myria-cluster stop my-cluster --region us-west-2

Start this cluster after stopping it:
myria-cluster start my-cluster --region us-west-2

Destroy this cluster:
myria-cluster destroy my-cluster --region us-west-2

Log into the coordinator node:
ssh -i /home/bhaynes/.ssh/bhaynes-myria_us-west-2.pem ubuntu@ec2-50-112-33-121.us-west-2.compute.amazonaws.com

myria-web interface:
http://ec2-50-112-33-121.us-west-2.compute.amazonaws.com:8080

MyriaX REST endpoint:
http://ec2-50-112-33-121.us-west-2.compute.amazonaws.com:8753

Ganglia web interface:
http://ec2-50-112-33-121.us-west-2.compute.amazonaws.com:8090

Jupyter notebook interface:
http://ec2-50-112-33-121.us-west-2.compute.amazonaws.co

## 3. Connecting to the Cluster via Python

Connect to the new cluster using its MyriaX REST endpoint URL. In the example above, this is http://ec2-50-112-33-121.us-west-2.compute.amazonaws.com:8753.
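
If the output of `myria-cluster create` has been captured as text, the endpoint can be picked out programmatically instead of copied by hand. The sketch below assumes the layout shown in the output above, where each label line is followed by its URL on the next line. `find_endpoint` is a hypothetical helper, not part of the Myria CLI.

```python
def find_endpoint(output, label="MyriaX REST endpoint:"):
    """Return the line that follows `label` in the CLI output, or None."""
    lines = [line.strip() for line in output.splitlines()]
    # Stop one line early so that a label on the last line has no match.
    for index, line in enumerate(lines[:-1]):
        if line == label:
            return lines[index + 1] or None
    return None


# Excerpt of the cluster creation output shown above.
output = """myria-web interface:
http://ec2-50-112-33-121.us-west-2.compute.amazonaws.com:8080

MyriaX REST endpoint:
http://ec2-50-112-33-121.us-west-2.compute.amazonaws.com:8753
"""

rest_url = find_endpoint(output)
print(rest_url)
```

The same function can fetch the other addresses by passing a different `label`, such as `"myria-web interface:"`.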

In [5]:
# Substitute your MyriaX REST URL here!
%connect http://ec2-52-1-38-182.compute-1.amazonaws.com:8753 

<myria.connection.MyriaConnection at 0x7f07c9614350>

# Where to find more information:

#### Documentation
[Myria Website](http://myria.cs.washington.edu/)<br /> 
[Myria Python](http://myria.cs.washington.edu/docs/myria-python/)<br /> 
[Additional Language Documentation](http://myria.cs.washington.edu/docs/myrial.html)<br /> 
[This Notebook](https://github.com/uwescience/myria-python/blob/master/ipnb%20examples/myria.ipynb) 

#### Repositories
[Myria](https://github.com/uwescience/myria)<br /> 
[Myria-Python](https://github.com/uwescience/myria-python)<br /> 
[Myria-Cluster](https://github.com/uwescience/myria-ec2-ansible)

#### Mailing List
[myria-users@cs.washington.edu](mailto:myria-users@cs.washington.edu)

#### Jupyter
[Homepage](http://jupyter.org/)

#### Pandas/DataFrames
[Homepage](http://pandas.pydata.org/)