# Classify text using fast.ai

This notebook will walk you through a simple example that trains a model to sort political Facebook ads into categories and then uses that model to label a larger set of ads.

This work is based on the early lessons in [Practical Deep Learning for Coders](https://course.fast.ai/), taught online by Jeremy Howard. I **highly** recommend this free online course.

## Using this notebook

Essentially, you need a computer with a GPU that's running fast.ai. There are a few ways to do this without owning a computer with a GPU (I certainly don't). There are [lots of options](https://course.fast.ai/index.html). I like to use [the Amazon EC2 setup](https://course.fast.ai/start_aws.html), which is probably the most complicated. In most of these cases, you'll just clone [the workshop repository](https://github.com/Quartz/aistudio-workshops) and get the notebook running.

I'm also tailoring this notebook for use with [Google Colaboratory](https://colab.research.google.com), which as of this writing is the fastest, cheapest (free) way to get going.


### If you're using Google Colaboratory ...

Be aware that Google Colab instances are ephemeral -- they vanish *Poof* when you close them, or after a period of sitting idle (currently 90 minutes).

There are great steps on the fast.ai site for [getting started with fast.ai and Google Colab](https://course.fast.ai/start_colab.html).

Those instructions will show you how to save your own copy of this _notebook_ to Google Drive.

They also tell you how to save a copy of your _data_ to Google Drive (Step 4), which is unnecessary for this workshop.

In [None]:
## ALL GOOGLE COLAB USERS RUN THIS CELL

## This runs a script that installs fast.ai
!curl -s https://course.fast.ai/setup/colab | bash

### If you are _not_ using Google Colaboratory ...

Run the cell below.

In [6]:
## NON-COLABORATORY USERS SHOULD RUN THIS CELL
%reload_ext autoreload
%autoreload 2
%matplotlib inline

### Everybody do this ...

In [7]:
## AND *EVERYBODY* SHOULD RUN THIS CELL

from fastai.text import *

## The Plan

Given a set of political Facebook ads, we want to sort them into three categories: fundraising, list-building, and persuasion.

We're going to take a hand-coded set of 1,700 ads (which Jeremy B. Merrill coded on a long flight), train a model on them, and apply that model to the larger Facebook ad database. As of this writing, that database has nearly 165,000 ads and clocks in at about 3.2 GB. So for this class, as a proof of concept, we'll take a slice of 5,000 ads.

Our plan will be:

- Download an English **language model** pre-trained on Wikipedia articles
- Further train that **language model** on the type of English we're working with, specifically the corpus of Facebook ads we have
- Train a **classification model** on the difference between fundraising, list-building, and persuasion ads
- Use that **classification model** to label the bigger group of ads
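
If the term **language model** is new to you: it's a model that learns which words tend to follow which. The pre-trained model in the plan above is a neural network, but the core idea can be sketched with nothing more than word-pair counts. The function names and sample sentences here are made up for illustration, and this is not how fast.ai does it.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Made-up sentences standing in for ad text
toy_ads = [
    "chip in five dollars today",
    "chip in before midnight",
    "sign the petition today",
]
model = train_bigram_model(toy_ads)
print(predict_next(model, "chip"))  # the word seen most often after "chip"
```

Training further on our own ads, as in the second step of the plan, amounts to updating those expectations with the vocabulary and phrasing that political ads actually use.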

## The Data

Let's get the two data sets we'll be using: The hand-labeled set of 1,700 ads and the raw set of 5,000 ads.

In [1]:
!wget -N https://qz-aistudio-public.s3.amazonaws.com/workshops/facebook_ad_data.zip
!unzip facebook_ad_data.zip > /dev/null
print('Done!')

--2019-08-16 18:58:15--  https://qz-aistudio-public.s3.amazonaws.com/workshops/facebook_ad_data.zip
Resolving qz-aistudio-public.s3.amazonaws.com (qz-aistudio-public.s3.amazonaws.com)... 52.216.206.211
Connecting to qz-aistudio-public.s3.amazonaws.com (qz-aistudio-public.s3.amazonaws.com)|52.216.206.211|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 7571822 (7.2M) [application/zip]
Saving to: ‘facebook_ad_data.zip’


2019-08-16 18:58:16 (108 MB/s) - ‘facebook_ad_data.zip’ saved [7571822/7571822]

Done!


Now you have a subdirectory called `facebook_ad_data` which contains two files.

In [4]:
%ls facebook_ad_data

ads_and_categories.csv  fbpac-ads-en-US-slice.csv


Next we'll load the `ads_and_categories.csv` file into a structure called a "data frame," which is a common way to handle large amounts of data in Python.

In [9]:
hand_coded_ads = pd.read_csv('facebook_ad_data/ads_and_categories.csv')

Let's take a peek!

In [11]:
hand_coded_ads.head()

| Unnamed: 0 | text | label |
|---|---|---|
| 0 | National Trust for Historic Preservation Sp S ... | FUNDRAISING |
| 1 | Jan Schneider Sponsored ⋅ Paid for by Friend... | FUNDRAISING |
| 2 | Planned Parenthood Action Sponsored ⋅ Paid fo... | FUNDRAISING |
| 3 | Suggested Post Josh Harder Sponsored ⋅ Paid f... | FUNDRAISING |
| 4 | Mayor Philip Levine Sponsored ⋅ Paid for by ... | FUNDRAISING |
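
The first five rows all happen to be `FUNDRAISING`, so it's worth checking how many ads carry each label before training anything. Here's a minimal sketch of that count using only Python's standard library. The rows are invented, and apart from `FUNDRAISING` the label spellings are my assumption about what the file contains.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the same text/label shape as ads_and_categories.csv
sample_csv = io.StringIO(
    "text,label\n"
    "Chip in $5 before midnight,FUNDRAISING\n"
    "Sign our petition today,LIST-BUILDING\n"
    "Donate now to keep us on the air,FUNDRAISING\n"
    "Here is where the candidate stands,PERSUASION\n"
)

# Tally how many rows carry each label
label_counts = Counter(row["label"] for row in csv.DictReader(sample_csv))
for label, count in label_counts.most_common():
    print(f"{label}: {count}")
```

With the real data frame, `hand_coded_ads['label'].value_counts()` gives the same kind of tally in one line.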


And let's load in the raw ads.

In [12]:
raw_ads = pd.read_csv('facebook_ad_data/fbpac-ads-en-US-slice.csv')

In [13]:
raw_ads.head()

Unnamed: 0,6102213074570,"<div class=""_5pcr userContentWrapper""><div class=""_1dwg _1w_m _q7o""><div class=""_4r_y"" id=""u_fetchstream_16_k""><div class=""_6a uiPopover _5pbi _cmw _b1e _1wbl"" id=""u_fetchstream_16_l""><a class=""_4xev _p"" href=""nullblank"" id=""u_fetchstream_16_n""></a></div></div><div><div class=""y_iap92bd8_ q_iap92heuk clearfix""><div class=""clearfix u_iap92bbio""><a class=""_5pb8 i_iap92heur _8o _8s lfloat _ohe"" data-hovercard=""https://www.facebook.com/38471053686"" href=""https://www.facebook.com/ElizabethWarren/""><div class=""_38vo""><img class=""_s0 _4ooo _5xib _5sq7 _44ma _rw img"" src=""https://pp-facebook-ads.s3.amazonaws.com/v/t1.0-1/p80x80/34441200_10155774706728687_8710237734062522368_n.png""></div></a><div class=""clearfix _42ef""><div class=""rfloat _ohf""></div><div class=""f_iap92bbiv""><div><div class=""_6a _5u5j""><div class=""_6a _6b""></div><div class=""_6a _5u5j _6b""><h5 class=""_14f3 _14f5 _5pbw _5vra"" id=""js_3qk""><span class=""fwn fcg""><span class=""fwb fcg""><a data-hovercard=""https://www.facebook.com/38471053686"" href=""https://www.facebook.com/ElizabethWarren/"">Elizabeth Warren</a></span></span></h5><div><a class=""_5pcq"" href=""nullblank""><span><span class=""_3nlk"">Sponsored</span> ⋅ Paid for by <span class=""_3nlk"">Elizabeth for MA</span></span></a><span> · </span><a class=""uiStreamPrivacy inlineBlock fbStreamPrivacy fbPrivacyAudienceIndicator _5pcq"" href=""nullblank""><i class=""lock img sp_ks9jMipqQdl_1_5x sx_2780ca""></i></a></div></div></div></div></div></div></div></div><div class=""_5pbx userContent _3576"" id=""js_3ql""><p>We made these stickers to celebrate people like you who have refused to back down as Trump and the GOP deliver one gut-punch after another to the working people across the country. 
Tell us where to send your FREE limited-edition PERSIST sticker and we'll put it right in the mail.</p></div><div class=""_3x-2""><div><div class=""mtm""><div class=""_1ci8""><div class=""_6m2 _1zpr clearfix _dcs _4_w4 _41u- _59ap _2bf7 _64lx _3eqz _20pq _3eqw _2rk1 _3n1j"" id=""u_fetchstream_16_m""><div class=""clearfix _2r3x""><div class=""lfloat _ohe""><span class=""_3m6-""><div class=""_6ks""><a href=""http://5061.xg4ken.com/media/redir.php""><div class=""_6l- __c_""><div class=""uiScaledImageContainer _6m5 fbStoryAttachmentImage""><img class=""scaledImageFitWidth img"" src=""https://pp-facebook-ads.s3.amazonaws.com/v/t45.1600-4/cp0/q90/c0.0.1200.627/s552x414/35451853_6102212225370_2445941908946550784_n.png.jpg""></div></div></a></div><div class=""_3ekx _29_4""><div class=""_44ae _651x""><div class=""_6m3 _--6""><div class=""_59tj _2iau""><div><div class=""_6lz _6mb _1t62 ellipsis"">elizabethwarren.com</div><div class=""""></div></div></div><div class=""_3n1k""><div class=""mbs _6m6 _2cnj _5s6c""><a href=""http://5061.xg4ken.com/media/redir.php"">Get your free Persist sticker now.</a></div><div class=""_6m7 _3bt9"">These limited-edition stickers are fresh from the printer and you can bet they won't last long.</div></div></div><div class=""_44af _2e6-""><div class=""_522u""><div class=""_6a""><a class=""_42ft _4jy0 _4jy4 _517h _51sy"" href=""http://5061.xg4ken.com/media/redir.php"">Learn More</a></div></div></div></div><a class=""_52c6"" href=""http://5061.xg4ken.com/media/redir.php""></a></div></span></div><div class=""_42ef""><span class=""_3c21""></span></div></div></div><div class=""_4hk3""><div class=""_34js _1kaa _34jv"" id=""u_fetchstream_16_o""><div class=""_34jx _2cpc _34ju""><div class=""_34k0""><i class=""_34k2""></i></div><div class=""_34k3"">Paid for by Elizabeth for MA</div><a class=""_34k6"" href=""nullblank"" id=""u_fetchstream_16_p""></a></div><div class=""_34jw""></div></div><div 
class=""_1dp8""></div></div></div></div></div></div><div></div></div></div><div></div></div>",3,0,Elizabeth Warren,<p>We made these stickers to celebrate people like you who have refused to back down as Trump and the GOP deliver one gut-punch after another to the working people across the country. Tell us where to send your FREE limited-edition PERSIST sticker and we'll put it right in the mail.</p>,https://pp-facebook-ads.s3.amazonaws.com/v/t1.0-1/p80x80/34441200_10155774706728687_8710237734062522368_n.png,2018-06-21 20:08:42.35098+00,2019-01-04 14:44:12.123412+00,en-US,...,f,"[{""target"": ""Age"", ""segment"": ""34 to 49""}, {""target"": ""MinAge"", ""segment"": ""34""}, {""target"": ""MaxAge"", ""segment"": ""49""}, {""target"": ""Region"", ""segment"": ""the United States""}, {""target"": ""Gender"", ""segment"": ""women""}, {""target"": ""List""}]",Elizabeth Warren.1,"[{""entity"": ""FREE"", ""entity_type"": ""Organization""}, {""entity"": ""GOP"", ""entity_type"": ""Organization""}, {""entity"": ""PERSIST"", ""entity_type"": ""Organization""}, {""entity"": ""Trump and"", ""entity_type"": ""Organization""}]",https://www.facebook.com/ElizabethWarren/,https://www.facebook.com/elizabethwarren/,"{""<div><div class=\""_4-i0 _26c5\""><div class=\""clearfix\""><div class=\""_51-u rfloat _ohf\""><a class=\""_42ft _5upp _50zy layerCancel _51-t _50-0 _50z-\"" href=\""nullblank\"">Close</a></div><div><h3 id=\""u_5x_0\"" class=\""_52c9\"">About this Facebook ad</h3></div></div></div><div class=\""_4-i2 _pig _4s3a _50f4\""><div class=\""_4uov\""><div class=\""_4uoz\""><div id=\""u_5x_1\""></div><div class=\""_3-8x\""><span class=\""_4v6n\""><div id=\""u_5x_2\"">One reason you're seeing this ad is that <b id=\""ad_prefs_advertiser\"">Elizabeth Warren</b> added you to a list of people they want to reach on Facebook. 
They were able to reach you because <b id=\""ad_prefs_data_file\"">you're on a customer list collected by Elizabeth Warren or its partners,</b> or you've provided them with your contact information off of Facebook.</div></span><div class=\""_4hcd\""><span>There may be other reasons why you're seeing this ad, including that Elizabeth Warren wants to reach <b>women aged 34 to 49 who live in the United States</b>. This is information based on your Facebook profile and where you've connected to the Internet.</span></div></div></div><div></div><div class=\""_4uor _52jw\""><div class=\""_5aj7\"" id=\""u_5x_3\""><div class=\""_4bl9\""></div><div class=\""_4bl7 _5r7v\""><i class=\""img sp_KkPK2i2wr3h_1_5x sx_2e7d05\""><u>gear</u></i></div><div class=\""_4bl7\""><a href=\""https://www.facebook.com/ads/preferences/\"">Manage Your ad preferences</a></div></div></div><div class=\""mhs _1ray _1ra- _4uos\""></div><div class=\""_4uou\""><div class=\""_26c6 _3es4 fsl fwb fcb\"">Tell us what you think</div><div><div class=\""clearfix _ikh\"" id=\""u_5x_4\""><div class=\""_4bl7 _3es3\"">Was this explanation useful?</div><div class=\""_4bl7 _2pit\""><a href=\""nullblank\"" id=\""u_5x_5\"">Yes</a></div><div class=\""_4bl9 _2pit\""><a href=\""nullblank\"" id=\""u_5x_6\"">No</a></div></div><span class=\""hidden_elem _3es3\"" id=\""u_5x_7\"">Thanks for your rating.</span></div></div></div></div><div class=\""_5lnf uiOverlayFooter _5a8u _4866\""><table class=\""uiGrid _51mz uiOverlayFooterGrid\""><tbody><tr class=\""_51mx\""><td class=\""_51m- prs uiOverlayFooterMessage\""><a><i class=\""_4868 img sp_lWDeTZIqYK9_1_5x sx_5cccfd\""><u>info</u></i><span class=\""_4867\"">Learn more about Facebook Ads</span></a></td><td class=\""_51m- uiOverlayFooterButtons _51mw\""></td></tr></tbody></table></div></div>"",""<div><div class=\""_4-i0 _26c5\""><div class=\""clearfix\""><div class=\""_51-u rfloat _ohf\""><a class=\""_42ft _5upp _50zy layerCancel _51-t _50-0 _50z-\"" 
href=\""nullblank\"">Close</a></div><div><h3 id=\""u_5x_0\"" class=\""_52c9\"">About this Facebook ad</h3></div></div></div><div class=\""_4-i2 _pig _4s3a _50f4\""><div class=\""_4uov\""><div class=\""_4uoz\""><div id=\""u_5x_1\""></div><div class=\""_3-8x\""><span class=\""_4v6n\""><div id=\""u_5x_2\"">One reason you're seeing this ad is that <b id=\""ad_prefs_advertiser\"">Elizabeth Warren</b> added you to a list of people they want to reach on Facebook. They were able to reach you because <b id=\""ad_prefs_data_file\"">you're on a customer list collected by Elizabeth Warren or its partners,</b> or you've provided them with your contact information off of Facebook.</div></span><div class=\""_4hcd\""><span>There may be other reasons why you're seeing this ad, including that Elizabeth Warren wants to reach <b>women aged 34 to 49 who live in the United States</b>. This is information based on your Facebook profile and where you've connected to the Internet.</span></div></div></div><div></div><div class=\""_4uor _52jw\""><div class=\""_5aj7\"" id=\""u_5x_3\""><div class=\""_4bl9\""></div><div class=\""_4bl7 _5r7v\""><i class=\""img sp_KkPK2i2wr3h_1_5x sx_2e7d05\""><u>gear</u></i></div><div class=\""_4bl7\""><a href=\""https://www.facebook.com/ads/preferences/\"">Manage Your ad preferences</a></div></div></div><div class=\""mhs _1ray _1ra- _4uos\""></div><div class=\""_4uou\""><div class=\""_26c6 _3es4 fsl fwb fcb\"">Tell us what you think</div><div><div class=\""clearfix _ikh\"" id=\""u_5x_4\""><div class=\""_4bl7 _3es3\"">Was this explanation useful?</div><div class=\""_4bl7 _2pit\""><a href=\""nullblank\"" id=\""u_5x_5\"">Yes</a></div><div class=\""_4bl9 _2pit\""><a href=\""nullblank\"" id=\""u_5x_6\"">No</a></div></div><span class=\""hidden_elem _3es3\"" id=\""u_5x_7\"">Thanks for your rating.</span></div></div></div></div><div class=\""_5lnf uiOverlayFooter _5a8u _4866\""><table class=\""uiGrid _51mz uiOverlayFooterGrid\""><tbody><tr class=\""_51mx\""><td 
class=\""_51m- prs uiOverlayFooterMessage\""><a><i class=\""_4868 img sp_lWDeTZIqYK9_1_5x sx_5cccfd\""><u>info</u></i><span class=\""_4867\"">Learn more about Facebook Ads</span></a></td><td class=\""_51m- uiOverlayFooterButtons _51mw\""></td></tr></tbody></table></div></div>"",""<div><div class=\""_4-i0 _26c5\""><div class=\""clearfix\""><div class=\""_51-u rfloat _ohf\""><a class=\""_42ft _5upp _50zy layerCancel _51-t _50-0 _50z-\"" href=\""nullblank\"">Close</a></div><div><h3 id=\""u_5x_0\"" class=\""_52c9\"">About this Facebook ad</h3></div></div></div><div class=\""_4-i2 _pig _4s3a _50f4\""><div class=\""_4uov\""><div class=\""_4uoz\""><div id=\""u_5x_1\""></div><div class=\""_3-8x\""><span class=\""_4v6n\""><div id=\""u_5x_2\"">One reason you're seeing this ad is that <b id=\""ad_prefs_advertiser\"">Elizabeth Warren</b> added you to a list of people they want to reach on Facebook. They were able to reach you because <b id=\""ad_prefs_data_file\"">you're on a customer list collected by Elizabeth Warren or its partners,</b> or you've provided them with your contact information off of Facebook.</div></span><div class=\""_4hcd\""><span>There may be other reasons why you're seeing this ad, including that Elizabeth Warren wants to reach <b>women aged 34 to 49 who live in the United States</b>. 
This is information based on your Facebook profile and where you've connected to the Internet.</span></div></div></div><div></div><div class=\""_4uor _52jw\""><div class=\""_5aj7\"" id=\""u_5x_3\""><div class=\""_4bl9\""></div><div class=\""_4bl7 _5r7v\""><i class=\""img sp_KkPK2i2wr3h_1_5x sx_2e7d05\""><u>gear</u></i></div><div class=\""_4bl7\""><a href=\""https://www.facebook.com/ads/preferences/\"">Manage Your ad preferences</a></div></div></div><div class=\""mhs _1ray _1ra- _4uos\""></div><div class=\""_4uou\""><div class=\""_26c6 _3es4 fsl fwb fcb\"">Tell us what you think</div><div><div class=\""clearfix _ikh\"" id=\""u_5x_4\""><div class=\""_4bl7 _3es3\"">Was this explanation useful?</div><div class=\""_4bl7 _2pit\""><a href=\""nullblank\"" id=\""u_5x_5\"">Yes</a></div><div class=\""_4bl9 _2pit\""><a href=\""nullblank\"" id=\""u_5x_6\"">No</a></div></div><span class=\""hidden_elem _3es3\"" id=\""u_5x_7\"">Thanks for your rating.</span></div></div></div></div><div class=\""_5lnf uiOverlayFooter _5a8u _4866\""><table class=\""uiGrid _51mz uiOverlayFooterGrid\""><tbody><tr class=\""_51mx\""><td class=\""_51m- prs uiOverlayFooterMessage\""><a><i class=\""_4868 img sp_lWDeTZIqYK9_1_5x sx_5cccfd\""><u>info</u></i><span class=\""_4867\"">Learn more about Facebook Ads</span></a></td><td class=\""_51m- uiOverlayFooterButtons _51mw\""></td></tr></tbody></table></div></div>""}",Unnamed: 21,9,0.309519095346332
0,23842898110640273,"<div class=""_5pa- userContentWrapper""><div cla...",3,0,Shri Thanedar,"<p>Vote for a Stronger, Smarter Michigan by Vo...",https://pp-facebook-ads.s3.amazonaws.com/v/t1....,2018-06-21 19:40:52.638102+00,2019-01-04 14:44:43.791667+00,en-US,...,f,"[{""target"": ""Age"", ""segment"": ""24 and older""},...",Shri Thanedar,"[{""entity"": ""Voting"", ""entity_type"": ""Organiza...",https://www.facebook.com/ShriForMI/,https://www.facebook.com/shriformi/,"{""<div><div class=\""_4-i0 _26c5\""><div class=\...",,4.0,0.245959
1,6092893255247,"<div class=""_5pcr userContentWrapper""><div cla...",1,3,CARE,<p>No food. No water. No home. There are nearl...,https://pp-facebook-ads.s3.amazonaws.com/v/t1....,2018-05-22 20:00:11.678637+00,2018-05-23 18:27:58.596213+00,en-US,...,f,"[{""target"": ""Age"", ""segment"": ""35 and older""},...",CARE,[],https://www.facebook.com/carefans/,https://www.facebook.com/carefans/,"{""<div><div class=\""_4-i0 _26c5\""><div class=\...",,4.0,0.999972
2,23842836266000723,"<div class=""_5pcr userContentWrapper""><div cla...",3,0,MJ for Texas,"<p>As a veteran of the U.S. Air Force, a mothe...",https://pp-facebook-ads.s3.amazonaws.com/v/t1....,2018-06-21 20:06:11.500505+00,2019-01-04 14:44:20.460757+00,en-US,...,f,"[{""target"": ""Age"", ""segment"": ""18 and older""},...",MJ for Texas,"[{""entity"": ""Congress"", ""entity_type"": ""Organi...",https://www.facebook.com/MJforTexas/,https://www.facebook.com/mjfortexas/,"{""<div><div class=\""_4-i0 _26c5\""><div class=\...",,2.0,1.0
3,23843564908750215,"<div class=""_5pcr userContentWrapper""><div cla...",0,0,"AARP Medicare Supplement Plans, insured by Uni...",<p>Like this page for quick and healthy holida...,https://pp-facebook-ads.s3.amazonaws.com/v/t1....,2018-12-29 19:01:53.327363+00,2018-12-29 19:01:53.327363+00,en-US,...,f,"[{""target"": ""Interest"", ""segment"": ""AARP""}, {""...","AARP Medicare Supplement Plans, insured by Uni...",[],https://www.facebook.com/AARPMedicareSupplement/,https://www.facebook.com/aarpmedicaresupplement/,"{""<div><div class=\""_4-i0 _26c5\""><div class=\...",UnitedHealthcare,5.0,
4,23842858437190544,"<div class=""_5pcr userContentWrapper""><div cla...",3,0,International Rescue Committee,<p>URGENT: President Trump’s executive order d...,https://pp-facebook-ads.s3.amazonaws.com/v/t1....,2018-06-21 20:06:11.46361+00,2019-01-04 14:44:22.044652+00,en-US,...,f,"[{""target"": ""Age"", ""segment"": ""18 and older""},...",International Rescue Committee,"[{""entity"": ""Donald Trump"", ""entity_type"": ""Pe...",https://www.facebook.com/InternationalRescueCo...,https://www.facebook.com/internationalrescueco...,"{""<div><div class=\""_4-i0 _26c5\""><div class=\...",,2.0,0.999903


Let's take a look at one ...

In [None]:
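from IPython.display import Image as Show  # assumption: `Show` is IPython's image display class; skip this line if you've already defined it
Show(filename='bikes_data/images/1/IMG_1494.JPG')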

Notebooks aren't great at playing videos, so I posted `bikes_data/intersection_movie.mov` on [Vimeo](https://vimeo.com/354069170).

Now we need to load our image data in a format that's ready for the training code. We do that with fast.ai's data block API.

In [None]:
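from fastai.vision import *  # the image steps need fast.ai's vision module (ImageList, get_transforms, cnn_learner ...)

data_path = Path('./bikes_data/images')  # The path for our data

np.random.seed(12)  # Fix the random seed so the train/validation split is repeatable

data = (ImageList.from_folder(data_path)  # Where to find the data? -> in data_path and its subfolders
        .split_by_rand_pct()  # How to split in train/valid? -> do it *randomly* (not by folder)
        .label_from_folder()  # How to label? -> use the name of the folder each file sits in
        .transform(get_transforms(), size=(224,224))  # Apply data transforms and resize images to 224x224
        .databunch(bs=48))  # Bundle everything into batches of 48 images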
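
To make the "split randomly" step less mysterious, here's a rough pure-Python sketch of what a seeded random split does. It is not fast.ai's code. The helper name and file names are made up, and the 20% hold-out mirrors what I believe is the default for `split_by_rand_pct()`.

```python
import random

def split_by_random_pct(items, valid_pct=0.2, seed=12):
    """Shuffle a copy of `items` and hold out `valid_pct` of them for validation."""
    rng = random.Random(seed)      # a fixed seed makes the split repeatable
    shuffled = list(items)         # copy so the caller's list is left alone
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]  # (train, valid)

filenames = [f"IMG_{i:04d}.JPG" for i in range(10)]  # hypothetical file names
train_files, valid_files = split_by_random_pct(filenames)
print(len(train_files), len(valid_files))
```

The seed matters because the validation set is what we use to judge the model. If it changed on every run, error rates from different runs wouldn't be comparable.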

In [None]:
data_path.ls()

In [None]:
data.show_batch(rows=3)

In [None]:
# Let's be sure to check our classes
print(data.classes)

## Training

Now we will start training our model. We will use a convolutional neural network backbone and a fully connected head with a single hidden layer as a classifier. Don't know what these things mean? Most people don't! For a deeper dive, check out the fast.ai courses.

But for now, you need to know that we are building a model which will take images as input and will output the predicted probability for each of the categories: 0 and 1.
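
Where do those probabilities come from? Classifiers like this one commonly produce a raw score per category and then squash the scores with a function called softmax so they add up to 1. Here's a small sketch of that step. The scores are made up, and I'm assuming softmax is what happens under the hood here.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    biggest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

raw_scores = [0.3, 2.1]  # made-up scores for categories 0 and 1
probabilities = softmax(raw_scores)
print(probabilities)
```

The category with the higher raw score always gets the higher probability, and two equal scores come out as 50/50.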

### Transfer learning with resnet34

Training a computer-vision model from scratch to solve our problem would take thousands of images. Maybe more. Instead we take advantage of an existing model that was trained to detect objects -- from planes, to cars, to dogs, to birds -- by processing millions of images. This model is called "resnet34."

With fast.ai we can infuse this model with our images (and their labels). This takes advantage of all of resnet34's "knowledge" of image-detection and tacks on our particular problem. This technique is called "transfer learning."

First we load our `data` and `models.resnet34` together into a training model known as a "learner."

In [None]:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

We will train for 6 epochs (6 cycles through all our data).

In [None]:
learn.fit_one_cycle(6)

#### How are we doing?

So far, we have a pretty good error rate. It's actually possible to do even better, but we'll stick with this for now.
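
In case "error rate" is fuzzy: it's just the share of validation items the model labeled wrong. Here's a tiny sketch with made-up predictions. The helper is my own illustration, not fast.ai's `error_rate` function.

```python
def error_rate(predicted, actual):
    """Fraction of predictions that don't match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

predicted_labels = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up model guesses
actual_labels    = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up true labels
print(error_rate(predicted_labels, actual_labels))  # 2 wrong out of 8
```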

We can take a look to see where the model was most confused, and whether what the model predicted was reasonable or not.

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
losses,idxs = interp.top_losses()
len(data.valid_ds)==len(losses)==len(idxs)
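
What is a "top loss"? Roughly, the loss for an item is large when the model gave a low probability to the correct label, so sorting by loss surfaces the items the model got most confidently wrong. Here's a sketch of that ranking using cross-entropy, a common loss for classifiers, which I'm assuming is the one in play. The item names and probabilities are invented.

```python
import math

def cross_entropy(prob_of_true_class):
    """Loss for one item: small when the model was right and confident."""
    return -math.log(prob_of_true_class)

# (item name, probability the model gave to the correct label); made-up values
validation_results = [
    ("frame_a", 0.98),
    ("frame_b", 0.40),
    ("frame_c", 0.05),
    ("frame_d", 0.75),
]

# Biggest loss first, so the worst mistakes lead the list
ranked = sorted(
    ((cross_entropy(p), name) for name, p in validation_results),
    reverse=True,
)
for loss, name in ranked:
    print(f"{name}: loss {loss:.2f}")
```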

In [None]:
interp.plot_top_losses(9, heatmap=True)

We can also see the situations in which it was most confused:

In [None]:
interp.plot_confusion_matrix(figsize=(4,4), dpi=60)
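
If you're curious how a confusion matrix is built, it's just a tally of (actual, predicted) pairs. Here's a minimal sketch with made-up labels. The helper name is mine, and fast.ai's plot does the counting for you.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))

actual_labels    = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up true labels
predicted_labels = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up model guesses
counts = confusion_counts(actual_labels, predicted_labels)

# One row per actual class, one column per predicted class
for actual in (0, 1):
    row = [counts[(actual, predicted)] for predicted in (0, 1)]
    print(f"actual {actual}: {row}")
```

The diagonal holds the correct answers. Anything off the diagonal is a mistake, and which corner it lands in tells you whether the model missed a bike or imagined one.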

Let's save what we have in case we mess it up later!

In [None]:
learn.save('bikes-1')

In [None]:
Show(filename='bikes_data/never_seen_image.jpg', width=640)

In [None]:
img = open_image('bikes_data/never_seen_image.jpg')

In [None]:
learn.predict(img)

In [None]:
pred_class,pred_idx,outputs = learn.predict(img)

In [None]:
pred_class

In [None]:
int(pred_class)

In [None]:
outputs

In [None]:
outputs[1]

We're also going to _export_ the entire package as a "pickle" file called `export.pkl`.

**Warning for Google Colab users!** Later, when you've done more fine tuning and want to save what you've done, you need to give permission for this notebook to write files to your Google Drive. In that case, you'll want to run the next cell and follow the permission-granting steps. For now, you can skip this.

In [None]:
## THIS CELL WILL ALLOW GOOGLE COLAB USERS TO SAVE MODELS TO YOUR GOOGLE DRIVE

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
save_path = Path(root_dir + 'ai-bikes/')
save_path.mkdir(parents=True, exist_ok=True)

In [None]:
learn.export("export.pkl")

## Search our video

Now we'll apply our model to our video! First we need to turn the video into a bunch of images using `ffmpeg`, which we loaded at the beginning of this notebook.

In [None]:
!ffmpeg -i bikes_data/intersection_movie.mov -vf fps=1 -vsync 0 myframe%04d.jpg
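
Because `-vf fps=1` keeps one frame per second and `%04d` numbers the files in order, a frame's file name tells you roughly where it sits in the video. Here's a small sketch of that conversion. The helper is hypothetical, it assumes ffmpeg starts numbering at 1 (its default, as far as I know), and the timing is approximate.

```python
import re

def frame_to_seconds(filename, fps=1.0, start_number=1):
    """Estimate how far into the video a numbered frame file was captured."""
    match = re.search(r"(\d+)\.jpg$", filename)
    if match is None:
        raise ValueError(f"no frame number found in {filename!r}")
    frame_number = int(match.group(1))
    return (frame_number - start_number) / fps

print(frame_to_seconds("myframe0025.jpg"))  # roughly 24 seconds into the video
```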

In [None]:
%ls

In [None]:
import glob  # needed for the file listing here and in the loop further down

glob.glob('myframe*.*')

In [None]:
# learn = load_learner(path)  # this gets the exported pickle file, which is stored in the image data path

In [None]:
file_list = sorted(glob.glob('myframe*.*'))  # sort so the frames are checked in order

for file in file_list:
    image = open_image(file)
    pred_class,pred_idx,outputs = learn.predict(image)

    # Only report frames predicted as class 1 (bike) with more than 85% confidence
    if int(pred_class) == 1 and outputs[1] > 0.85:
        print(f'Bike detected in {file} with confidence {outputs[1]}')
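
The two-part test in that loop (predicted class is 1 *and* the confidence clears 0.85) can be pulled into a small function, which makes it easy to try different thresholds without re-running the model. This is a sketch on made-up prediction tuples, not on real `learn.predict` output.

```python
def frames_with_bikes(predictions, threshold=0.85):
    """Keep frames predicted as class 1 whose class-1 probability beats the threshold.

    `predictions` is a list of (filename, predicted_class, prob_of_class_1) tuples.
    """
    return [
        (filename, prob)
        for filename, predicted_class, prob in predictions
        if predicted_class == 1 and prob > threshold
    ]

made_up_predictions = [
    ("myframe0001.jpg", 0, 0.10),
    ("myframe0002.jpg", 1, 0.60),  # predicted bike, but not confident enough
    ("myframe0003.jpg", 1, 0.92),
]
detections = frames_with_bikes(made_up_predictions)
print(detections)
```

Lowering the threshold catches more bikes but lets in more false alarms. Raising it does the opposite.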
    

In [None]:
Show(filename='myframe0025.jpg')