### TuriCreate Crash Course
+ TuriCreate
  - made by Apple

#### Task
+ Basics
  - How to read csv files
  - How to create dataframes
  - Basic Data Manipulation
+ Text Classification with ML
  - Disaster classifier

#### Installation
+ pip install turicreate


#### Features
+ Easy-to-use: Focus on tasks instead of algorithms
+ Visual: Built-in, streaming visualizations to explore your data
+ Flexible: Supports text, images, audio, video and sensor data
+ Fast and Scalable: Work with large datasets on a single machine
+ Ready To Deploy: Export models to Core ML for use in iOS, macOS, watchOS, and tvOS apps

In [1]:
!pip install turicreate

Collecting turicreate
  Downloading https://files.pythonhosted.org/packages/25/9f/a76acc465d873d217f05eac4846bd73d640b9db6d6f4a3c29ad92650fbbe/turicreate-6.4.1-cp37-cp37m-manylinux1_x86_64.whl (92.0MB)
Collecting coremltools==3.3
  Downloading https://files.pythonhosted.org/packages/1b/1d/b1a99beca7355b6a026ae61fd8d3d36136e5b36f13e92ec5f81aceffc7f1/coremltools-3.3-cp37-none-manylinux1_x86_64.whl (3.5MB)
Collecting resampy==0.2.1
  Downloading https://files.pythonhosted.org/packages/14/b6/66a06d85474190b50aee1a6c09cdc95bb405ac47338b27e9b21409da1760/resampy-0.2.1.tar.gz (322kB)
Collecting tensorflow<2.1.0,>=2.0.0
  Downloading https://files.pythonhosted.org/packages/3c/b3/3eeae9bc44039ceadceac0c7ba1cc8b1482b172810b3d7624a1cad251437/tensorflow-2.0.4-cp37-cp37m-manylinux2010_x86_64.whl (86.4MB

In [2]:
# Load packages
import turicreate as tc

In [3]:
# List the methods and attributes exposed by the package
dir(tc)

['Edge',
 'Image',
 'SArray',
 'SArrayBuilder',
 'SFrame',
 'SFrameBuilder',
 'SGraph',
 'Sketch',
 'Vertex',
 '_',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '_connect',
 '_cython',
 '_deps',
 '_extensions_wrapper',
 '_launch',
 '_scripts',
 '_sys',
 '_sys_util',
 'activity_classifier',
 'aggregate',
 'boosted_trees_classifier',
 'boosted_trees_regression',
 'classifier',
 'clustering',
 'config',
 'connected_components',
 'data_structures',
 'dbscan',
 'decision_tree_classifier',
 'decision_tree_regression',
 'degree_counting',
 'distances',
 'drawing_classifier',
 'evaluation',
 'extensions',
 'factorization_recommender',
 'graph_analytics',
 'graph_coloring',
 'image_analysis',
 'image_classifier',
 'image_similarity',
 'item_content_recommender',
 'item_similarity_recommender',
 'kcore',
 'kmeans',
 'label_propagation',
 'linear_regression',
 'load_audio',
 'load_images',
 'load_mo

#### How to Create a Dataframe
+ SFrame is a scalable, tabular, column-mutable dataframe object.

In [4]:
# Method 1: Create a dataframe from a dictionary
df1 = tc.SFrame({'text':['there was an earthquake last year']})

In [5]:
df1

text
there was an earthquake last year ...


In [6]:
# Method 2: From Pandas
import pandas as pd
df = pd.DataFrame({'age':[1,2,3,4,5]})

In [7]:
df

,age
0,1
1,2
2,3
3,4
4,5


In [8]:
df2 = tc.SFrame(data=df)

In [9]:
df2

age
1
2
3
4
5


In [10]:
print(type(df))
print(type(df2))

<class 'pandas.core.frame.DataFrame'>
<class 'turicreate.data_structures.sframe.SFrame'>


In [11]:
# Reading Data From CSV
df = tc.SFrame('disaster_tweets.csv')

------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,str,str,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
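
If the inferred types turn out to be wrong, the message above suggests passing them explicitly. A minimal sketch of that, assuming the five columns and types reported for `disaster_tweets.csv`; `COLUMN_TYPES` and `load_tweets` are illustrative names, not part of TuriCreate:

```python
# Column types copied from the inference message above (assumed correct for this file).
COLUMN_TYPES = {
    "id": int,
    "keyword": str,
    "location": str,
    "text": str,
    "target": int,
}


def load_tweets(path="disaster_tweets.csv"):
    """Read the CSV with explicit column types instead of relying on inference."""
    # Imported inside the function so the mapping can be inspected
    # even where TuriCreate is not installed.
    import turicreate as tc

    return tc.SFrame.read_csv(path, column_type_hints=COLUMN_TYPES)
```

Calling `load_tweets()` should give the same SFrame as `tc.SFrame('disaster_tweets.csv')` whenever the inference was already right; the explicit mapping mainly helps when a column is mostly empty in the first 100 lines.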


In [12]:
# Display/Preview
df.head()

id,keyword,location,text,target
1,,,Our Deeds are the Reason of this #earthquake May ...,1
4,,,Forest fire near La Ronge Sask. Canada ...,1
5,,,All residents asked to 'shelter in place' are ...,1
6,,,"13,000 people receive #wildfires evacuation ...",1
7,,,Just got sent this photo from Ruby #Alaska as ...,1
8,,,#RockyFire Update => California Hwy. 20 cl ...,1
10,,,#flood #disaster Heavy rain causes flash ...,1
13,,,I'm on top of the hill and I can see a fire in ...,1
14,,,There's an emergency evacuation happening now ...,1
15,,,I'm afraid that the tornado is coming to our ...,1


In [13]:
# Display 5 rows
df.head(5)

id,keyword,location,text,target
1,,,Our Deeds are the Reason of this #earthquake May ...,1
4,,,Forest fire near La Ronge Sask. Canada ...,1
5,,,All residents asked to 'shelter in place' are ...,1
6,,,"13,000 people receive #wildfires evacuation ...",1
7,,,Just got sent this photo from Ruby #Alaska as ...,1


In [14]:
# Check For Shape
df.shape

(7613, 5)

In [16]:
# Number of columns
df.num_columns()

5

In [18]:
# Number of rows
df.num_rows()

7613

In [19]:
# Check For Column Names
df.column_names()

['id', 'keyword', 'location', 'text', 'target']

In [20]:
# Datatype
# Method 1
df.column_types()

[int, str, str, str, int]

In [22]:
# Check For Datatypes
# Method 2
df.dtype

[int, str, str, str, int]

In [23]:
### Selection of Columns
df['text']

dtype: str
Rows: 7613
['Our Deeds are the Reason of this #earthquake May ALLAH Forgive us all', 'Forest fire near La Ronge Sask. Canada', "All residents asked to 'shelter in place' are being notified by officers. No other evacuation or shelter in place orders are expected", '13,000 people receive #wildfires evacuation orders in California ', 'Just got sent this photo from Ruby #Alaska as smoke from #wildfires pours into a school', '#RockyFire Update => California Hwy. 20 closed in both directions due to Lake County fire - #CAfire #wildfires', '#flood #disaster Heavy rain causes flash flooding of streets in Manitou, Colorado Springs areas', "I'm on top of the hill and I can see a fire in the woods...", "There's an emergency evacuation happening now in the building across the street", "I'm afraid that the tornado is coming to our area...", 'Three people died from the heat wave so far', 'Haha South Tampa is getting flooded hah- WAIT A SECOND I LIVE IN SOUTH TAMPA WHAT AM I GONNA DO WHAT A

In [24]:
### Selection of Only one Column
df.select_column('id')

dtype: int
Rows: 7613
[1, 4, 5, 6, 7, 8, 10, 13, 14, 15, 16, 17, 18, 19, 20, 23, 24, 25, 26, 28, 31, 32, 33, 34, 36, 37, 38, 39, 40, 41, 44, 48, 49, 50, 52, 53, 54, 55, 56, 57, 59, 61, 62, 63, 64, 65, 66, 67, 68, 71, 73, 74, 76, 77, 78, 79, 80, 81, 82, 83, 85, 86, 89, 91, 92, 93, 95, 96, 97, 98, 100, 102, 104, 105, 107, 109, 110, 112, 113, 114, 117, 118, 119, 120, 121, 126, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 141, 143, ... ]

In [25]:
### Selection of Multiple Columns
df[['text','target']]

text,target
Our Deeds are the Reason of this #earthquake May ...,1
Forest fire near La Ronge Sask. Canada ...,1
All residents asked to 'shelter in place' are ...,1
"13,000 people receive #wildfires evacuation ...",1
Just got sent this photo from Ruby #Alaska as ...,1
#RockyFire Update => California Hwy. 20 cl ...,1
#flood #disaster Heavy rain causes flash ...,1
I'm on top of the hill and I can see a fire in ...,1
There's an emergency evacuation happening now ...,1
I'm afraid that the tornado is coming to our ...,1


In [27]:
### Selection of Multiple Columns (using select_columns)
df.select_columns(['text','target'])

text,target
Our Deeds are the Reason of this #earthquake May ...,1
Forest fire near La Ronge Sask. Canada ...,1
All residents asked to 'shelter in place' are ...,1
"13,000 people receive #wildfires evacuation ...",1
Just got sent this photo from Ruby #Alaska as ...,1
#RockyFire Update => California Hwy. 20 cl ...,1
#flood #disaster Heavy rain causes flash ...,1
I'm on top of the hill and I can see a fire in ...,1
There's an emergency evacuation happening now ...,1
I'm afraid that the tornado is coming to our ...,1


In [28]:
df

id,keyword,location,text,target
1,,,Our Deeds are the Reason of this #earthquake May ...,1
4,,,Forest fire near La Ronge Sask. Canada ...,1
5,,,All residents asked to 'shelter in place' are ...,1
6,,,"13,000 people receive #wildfires evacuation ...",1
7,,,Just got sent this photo from Ruby #Alaska as ...,1
8,,,#RockyFire Update => California Hwy. 20 cl ...,1
10,,,#flood #disaster Heavy rain causes flash ...,1
13,,,I'm on top of the hill and I can see a fire in ...,1
14,,,There's an emergency evacuation happening now ...,1
15,,,I'm afraid that the tornado is coming to our ...,1


In [29]:
# Plot of Value Counts
df['target'].plot()

<turicreate.visualization._plot.Plot at 0x7fb5bbc94610>

In [30]:
dir(df)

['_SFrame__dropna_errchk',
 '__add__',
 '__class__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get_column_description__',
 '__get_pretty_tables__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__has_size__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__is_materialized__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__nonzero__',
 '__proxy__',
 '__query_plan_str__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__str_impl__',
 '__subclasshook__',
 '_cache',
 '_imagecols_to_stringcols',
 '_infer_column_types_from_lines',
 '_proxy',
 '_read_csv_impl',
 '_repr_html_',
 '_row_selector',
 '_save_reference',
 'add_column',
 'add_columns',
 'add_row_number',
 'append',
 'apply',
 'column_names',
 'column_types',
 'copy',
 'drop_duplicates',
 'dropna',
 'dropna_split',
 'dtype',
 'ex

In [31]:
df['target'].unique()

dtype: int
Rows: 2
[0, 1]
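
Beyond listing the distinct labels, it is often useful to see how many rows carry each one. A minimal sketch that counts values by iterating over the column, assuming the column is small enough to walk through in Python; `label_counts` is an illustrative helper, not a TuriCreate function:

```python
from collections import Counter


def label_counts(values):
    """Count how often each distinct value occurs in an iterable column."""
    return dict(Counter(values))


# With the SFrame above this would be called as: label_counts(df['target'])
```

The helper accepts any iterable, so it can be tried on a plain list first. For very large columns, an aggregation done inside TuriCreate would likely be the better choice.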

In [39]:
# Apply a function to every element of a column
df['text'].apply(lambda x: str(x).lower())

dtype: str
Rows: 7613
['our deeds are the reason of this #earthquake may allah forgive us all', 'forest fire near la ronge sask. canada', "all residents asked to 'shelter in place' are being notified by officers. no other evacuation or shelter in place orders are expected", '13,000 people receive #wildfires evacuation orders in california ', 'just got sent this photo from ruby #alaska as smoke from #wildfires pours into a school', '#rockyfire update => california hwy. 20 closed in both directions due to lake county fire - #cafire #wildfires', '#flood #disaster heavy rain causes flash flooding of streets in manitou, colorado springs areas', "i'm on top of the hill and i can see a fire in the woods...", "there's an emergency evacuation happening now in the building across the street", "i'm afraid that the tornado is coming to our area...", 'three people died from the heat wave so far', 'haha south tampa is getting flooded hah- wait a second i live in south tampa what am i gonna do what a
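
The lambda above can also be written as a named function, which is easier to test and reuse. A small sketch, assuming light cleanup is wanted before classification; `normalize_text` is a hypothetical helper, and the whitespace collapsing goes beyond what the cell above does:

```python
def normalize_text(value):
    """Lowercase a value and collapse runs of whitespace into single spaces."""
    return " ".join(str(value).lower().split())


# With the SFrame above this would be used as: df['text'].apply(normalize_text)
```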

In [40]:
# Show value counts of a column
df['target'].show()

In [41]:
# Remove/Delete a column (returns a new SFrame; df keeps 'id' unless inplace=True is passed)
df.remove_column('id')

keyword,location,text,target
,,Our Deeds are the Reason of this #earthquake May ...,1
,,Forest fire near La Ronge Sask. Canada ...,1
,,All residents asked to 'shelter in place' are ...,1
,,"13,000 people receive #wildfires evacuation ...",1
,,Just got sent this photo from Ruby #Alaska as ...,1
,,#RockyFire Update => California Hwy. 20 cl ...,1
,,#flood #disaster Heavy rain causes flash ...,1
,,I'm on top of the hill and I can see a fire in ...,1
,,There's an emergency evacuation happening now ...,1
,,I'm afraid that the tornado is coming to our ...,1
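
Without `inplace=True`, the call above returns a new SFrame and leaves `df` unchanged, which is why `id` is still present in the cells that follow. A minimal sketch of a wrapper that makes the non-mutating behaviour explicit; `without_columns` is an illustrative name, not a TuriCreate function:

```python
def without_columns(sf, names):
    """Return a copy of `sf` lacking `names`; `sf` keeps all of its columns."""
    # inplace is left at its default, so the original frame is not modified.
    return sf.remove_columns(list(names))


# With the SFrame above this would be used as:
#   trimmed = without_columns(df, ['keyword', 'location'])
```

Assigning the result to a new name keeps the full dataset available for later cells, while `inplace=True` suits the case where the dropped columns are never needed again.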


In [42]:
# Remove Multiple Columns
df.remove_columns(['keyword','location'],inplace=True)

id,text,target
1,Our Deeds are the Reason of this #earthquake May ...,1
4,Forest fire near La Ronge Sask. Canada ...,1
5,All residents asked to 'shelter in place' are ...,1
6,"13,000 people receive #wildfires evacuation ...",1
7,Just got sent this photo from Ruby #Alaska as ...,1
8,#RockyFire Update => California Hwy. 20 cl ...,1
10,#flood #disaster Heavy rain causes flash ...,1
13,I'm on top of the hill and I can see a fire in ...,1
14,There's an emergency evacuation happening now ...,1
15,I'm afraid that the tornado is coming to our ...,1


In [43]:
df

id,text,target
1,Our Deeds are the Reason of this #earthquake May ...,1
4,Forest fire near La Ronge Sask. Canada ...,1
5,All residents asked to 'shelter in place' are ...,1
6,"13,000 people receive #wildfires evacuation ...",1
7,Just got sent this photo from Ruby #Alaska as ...,1
8,#RockyFire Update => California Hwy. 20 cl ...,1
10,#flood #disaster Heavy rain causes flash ...,1
13,I'm on top of the hill and I can see a fire in ...,1
14,There's an emergency evacuation happening now ...,1
15,I'm afraid that the tornado is coming to our ...,1


In [44]:
# Save dataset as CSV (the format is inferred from the file extension)
df.save('disaster_data.csv')

In [45]:
# Save in TuriCreate's binary SFrame format
df.save('disaster_data2.sframe')

In [46]:
# Reload data
df2 = tc.load_sframe('disaster_data2.sframe')

In [47]:
df2

id,text,target
1,Our Deeds are the Reason of this #earthquake May ...,1
4,Forest fire near La Ronge Sask. Canada ...,1
5,All residents asked to 'shelter in place' are ...,1
6,"13,000 people receive #wildfires evacuation ...",1
7,Just got sent this photo from Ruby #Alaska as ...,1
8,#RockyFire Update => California Hwy. 20 cl ...,1
10,#flood #disaster Heavy rain causes flash ...,1
13,I'm on top of the hill and I can see a fire in ...,1
14,There's an emergency evacuation happening now ...,1
15,I'm afraid that the tornado is coming to our ...,1
