# Fire up GraphLab Create

We always start by importing `graphlab` before using any part of GraphLab Create. Loading the library can take up to 30 seconds, so be patient!

The first time you use GraphLab Create, you must enter a product key to license the software for non-commercial academic use. To register for a free one-year academic license and obtain your key, go to [dato.com](https://dato.com/download/academic.html).

In [1]:
import graphlab
# Set product key on this computer. After running this cell, you will not need to re-enter your product key. 
graphlab.product_key.set_product_key('08C2-9A7F-A4F6-F2EB-880A-76AB-C070-0FCB')

# Limit number of worker processes. This preserves system memory, which prevents hosted notebooks from crashing.
graphlab.set_runtime_config('GRAPHLAB_DEFAULT_NUM_PYLAMBDA_WORKERS', 4)

# Output active product key.
graphlab.product_key.get_product_key()

[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: C:\Users\e150003\AppData\Local\Temp\graphlab_server_1519698946.log.0


This non-commercial license of GraphLab Create for academic use is assigned to E150003@e.ntu.edu.sg and will expire on February 26, 2019.


'08C2-9A7F-A4F6-F2EB-880A-76AB-C070-0FCB'

# Load a tabular data set

In [2]:
sf = graphlab.SFrame('people-example.csv')

------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,str,str,long]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
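
The message above says the loader guessed one type per column from the first lines of the file, and that the guess can be overridden through `column_type_hints`. As a rough illustration of what "one type per column" means, here is a minimal plain-Python sketch that does not use GraphLab at all. The helper `parse_with_hints` and the names `SAMPLE` and `COLUMN_TYPES` are made up for this example, and `int` stands in for the `long` shown in the message.

```python
import csv

# Two rows copied from the table used in this notebook.
SAMPLE = """First Name,Last Name,Country,age
Bob,Smith,United States,24
Alice,Williams,Canada,23
"""

# One converter per column, mirroring the hints [str, str, str, long].
# Plain int is used here in place of Python 2's long.
COLUMN_TYPES = [str, str, str, int]


def parse_with_hints(text, column_types):
    """Parse CSV text and convert each cell with its column's type."""
    rows = csv.reader(text.splitlines())
    header = next(rows)
    records = []
    for row in rows:
        converted = [convert(cell) for convert, cell in zip(column_types, row)]
        records.append(dict(zip(header, converted)))
    return records


records = parse_with_hints(SAMPLE, COLUMN_TYPES)
```

If the last hint were `str` instead of `int`, the ages would stay as text and numeric operations such as a mean would no longer make sense, which is why a wrong inferred type is worth correcting.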


# SFrame basics

In [3]:
sf  # view the first few rows of the table

First Name,Last Name,Country,age
Bob,Smith,United States,24
Alice,Williams,Canada,23
Malcolm,Jone,England,22
Felix,Brown,USA,23
Alex,Cooper,Poland,23
Tod,Campbell,United States,22
Derek,Ward,Switzerland,25


In [4]:
sf.tail()  # view end of the table

First Name,Last Name,Country,age
Bob,Smith,United States,24
Alice,Williams,Canada,23
Malcolm,Jone,England,22
Felix,Brown,USA,23
Alex,Cooper,Poland,23
Tod,Campbell,United States,22
Derek,Ward,Switzerland,25


# GraphLab Canvas

In [5]:
sf.show()

Canvas is accessible via web browser at the URL: http://localhost:13562/index.html
Opening Canvas in default web browser.


In [6]:
graphlab.canvas.set_target('ipynb')

In [7]:
sf['age'].show(view='Categorical')

# Inspect columns of dataset

In [8]:
sf['Country']

dtype: str
Rows: 7
['United States', 'Canada', 'England', 'USA', 'Poland', 'United States', 'Switzerland']

In [9]:
sf['age']

dtype: int
Rows: 7
[24L, 23L, 22L, 23L, 23L, 22L, 25L]

# Some simple columnar operations

In [10]:
sf['age'].mean()

23.142857142857146

In [11]:
sf['age'].max()

25L
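
As a sanity check, the same two aggregates can be reproduced with plain Python on the seven ages printed earlier. This is only a sketch for comparison, not how GraphLab computes them; the names `ages`, `mean_age` and `max_age` are chosen for this example.

```python
# The seven values of the age column, copied from the output above.
ages = [24, 23, 22, 23, 23, 22, 25]

# float() keeps the division exact under Python 2 as well as Python 3.
mean_age = sum(ages) / float(len(ages))
max_age = max(ages)
```

The mean agrees with the value reported above to many decimal places. The very last digits may differ, because floating-point results depend on the order in which the values are accumulated.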

# Create new columns in our SFrame

In [12]:
sf

First Name,Last Name,Country,age
Bob,Smith,United States,24
Alice,Williams,Canada,23
Malcolm,Jone,England,22
Felix,Brown,USA,23
Alex,Cooper,Poland,23
Tod,Campbell,United States,22
Derek,Ward,Switzerland,25


In [13]:
sf['Full Name'] = sf['First Name'] + ' ' + sf['Last Name']

In [14]:
sf

First Name,Last Name,Country,age,Full Name
Bob,Smith,United States,24,Bob Smith
Alice,Williams,Canada,23,Alice Williams
Malcolm,Jone,England,22,Malcolm Jone
Felix,Brown,USA,23,Felix Brown
Alex,Cooper,Poland,23,Alex Cooper
Tod,Campbell,United States,22,Tod Campbell
Derek,Ward,Switzerland,25,Derek Ward


In [15]:
sf['age'] * sf['age']

dtype: int
Rows: 7
[576L, 529L, 484L, 529L, 529L, 484L, 625L]
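
Both the `Full Name` concatenation and `sf['age'] * sf['age']` work element by element: row i of the result is built from row i of each input column. A minimal plain-Python sketch of that behavior, using the first three rows of the table and list names chosen for this example, might look like this:

```python
# First three rows of the relevant columns, copied from the table above.
first_names = ['Bob', 'Alice', 'Malcolm']
last_names = ['Smith', 'Williams', 'Jone']
ages = [24, 23, 22]

# Pair up the values row by row, then combine each pair.
full_names = [first + ' ' + last
              for first, last in zip(first_names, last_names)]

# Multiply each age by itself, one row at a time.
squared_ages = [age * age for age in ages]
```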

# Use the apply function to do an advanced transformation of our data

In [16]:
sf['Country']

dtype: str
Rows: 7
['United States', 'Canada', 'England', 'USA', 'Poland', 'United States', 'Switzerland']

In [17]:
sf['Country'].show()

In [18]:
def transform_country(country):
    if country == 'USA':
        return 'United States'
    else:
        return country

In [19]:
transform_country('Brazil')

'Brazil'

In [20]:
transform_country('Brasil')

'Brasil'

In [21]:
transform_country('USA')

'United States'
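
If more spellings need to be unified later, one possible variant is to keep the replacements in a dictionary rather than in a chain of `if` statements. This is a sketch of an alternative, not the function used in this notebook; `COUNTRY_ALIASES` and `normalize_country` are hypothetical names.

```python
# Map each alternative spelling to the spelling we want to keep.
COUNTRY_ALIASES = {'USA': 'United States'}


def normalize_country(country):
    """Return the preferred spelling, or the input unchanged if unknown."""
    return COUNTRY_ALIASES.get(country, country)
```

Like `transform_country`, this leaves unknown values such as 'Brazil' and 'Brasil' untouched, so it does not merge spellings that are not listed in the dictionary.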

In [22]:
sf['Country'].apply(transform_country)

dtype: str
Rows: 7
['United States', 'Canada', 'England', 'United States', 'Poland', 'United States', 'Switzerland']
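
Conceptually, `apply` calls the function once per value and collects the results into a new column. The plain-Python sketch below mimics that on a list; `apply_to_column` is a helper invented for this example and is not part of GraphLab.

```python
def transform_country(country):
    """Same rule as in the notebook: fold 'USA' into 'United States'."""
    if country == 'USA':
        return 'United States'
    return country


def apply_to_column(values, fn):
    """Call fn on every value and return the results as a new list."""
    return [fn(value) for value in values]


countries = ['United States', 'Canada', 'England', 'USA',
             'Poland', 'United States', 'Switzerland']
cleaned = apply_to_column(countries, transform_country)
```

The original list is left unchanged, and the output above shows only the transformed values, which is why the next cell assigns the result back to `sf['Country']`.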

In [23]:
sf['Country'] = sf['Country'].apply(transform_country)

In [24]:
sf

First Name,Last Name,Country,age,Full Name
Bob,Smith,United States,24,Bob Smith
Alice,Williams,Canada,23,Alice Williams
Malcolm,Jone,England,22,Malcolm Jone
Felix,Brown,United States,23,Felix Brown
Alex,Cooper,Poland,23,Alex Cooper
Tod,Campbell,United States,22,Tod Campbell
Derek,Ward,Switzerland,25,Derek Ward
