<a href="https://colab.research.google.com/github/tproffen/ORCSGirlsPython/blob/master/DoodleMining/QuickDrawDemos.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://github.com/tproffen/ORCSGirlsPython/blob/master/Images/Logo.png?raw=1" width="10%" align="right" hspace="50">

# Doodle Data Mining

## Demonstration of using the Quickdraw dataset

* Quickdraw: https://quickdraw.withgoogle.com/ 
* Quickdraw dataset: https://quickdraw.withgoogle.com/data
* Quickdraw API: https://quickdraw.readthedocs.io/en/latest/index.html

* All valid categories: https://github.com/googlecreativelab/quickdraw-dataset/blob/master/categories.txt

**More reading**

* https://opensource.googleblog.com/2018/11/introducing-web-component-and-data-api-for-quick-draw.html

## Run these calls first

The two cells below need to be executed first. They install the Quickdraw API and load the needed Python modules.

In [None]:
!curl -s -o setup.sh https://raw.githubusercontent.com/tproffen/ORCSGirlsPython/master/DoodleMining/Helpers/setup_activity1.sh
!bash setup.sh

from Helpers.helpers import *

## Retrieving drawings
### Drawing the image of a single doodle

First we retrieve a number of drawings of spiders (default is 1000, but we change that later). Then we get the image of one and display it. For a list of all objects it knows, click <a href="https://github.com/googlecreativelab/quickdraw-dataset/blob/master/categories.txt" target="_blank">here</a>.

In [None]:
what="spider"
doodles = QuickDrawDataGroup(what)

We put the download in its own cell because we want to retrieve the 1000 spiders only once; rerunning the cell below then shows a different spider each time. Also note that `doodle.get_image()` has optional parameters. To draw a red spider with thicker lines on a yellow background, you could change the call to `doodle.get_image(stroke_color=(255, 0, 0), stroke_width=3, bg_color=(255, 255, 0))`.

In [None]:
doodle = doodles.get_drawing()
image = doodle.get_image()

plt.figure()
plt.imshow(image)
plt.axis("off")
plt.show()

print("Country:", doodle.countrycode)
print("Date:", datetime.fromtimestamp(doodle.timestamp))
print("Number of strokes:", len(doodle.strokes))
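
The date printed above comes from a Unix timestamp, which the cell treats as seconds since 1 January 1970. By default `datetime.fromtimestamp` converts it to the local time of the machine running the notebook. A minimal sketch of that conversion, using a made-up timestamp (not taken from the dataset) and UTC so the result does not depend on where the notebook runs:

```python
from datetime import datetime, timezone

# Made-up example value, not taken from the dataset
example_timestamp = 1490000000

# Passing tz=timezone.utc makes the result independent of the local time zone
when = datetime.fromtimestamp(example_timestamp, tz=timezone.utc)
print("Date:", when.strftime("%Y-%m-%d %H:%M:%S %Z"))
```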

### Displaying multiple doodles

Imagine you want to see multiple doodles in one plot, each with a title showing the country it came from. It is as simple as adding a `for` loop (and some plotting magic).

In [None]:
plt.figure(figsize=(15, 10)) # Makes the plot area bigger

# How many rows and columns of doodles we want
rows = 3
cols = 5

for i in range(rows * cols):
  
  doodle = doodles.get_drawing()
  doodleimg = doodle.get_image(stroke_color=(0, 0, 255), stroke_width=3, bg_color=(255, 220, 220))

  plt.subplot(rows, cols, i+1)
  plt.imshow(doodleimg)
  plt.title("Country: " + doodle.countrycode)
  plt.axis('off')

plt.show()
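
Part of the plotting magic is the `i+1` in `plt.subplot(rows, cols, i+1)`: matplotlib numbers the panels starting at 1, going left to right and then top to bottom, while `range` starts counting at 0. The sketch below only prints where each doodle ends up in the grid; `grid_position` is a hypothetical helper written for this illustration, not part of matplotlib.

```python
rows = 3
cols = 5

def grid_position(i, cols):
    """Return the (row, column) of the i-th doodle, both counted from 0."""
    return divmod(i, cols)

for i in range(rows * cols):
    row, col = grid_position(i, cols)
    print(f"doodle {i}: subplot index {i + 1} -> row {row}, column {col}")
```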


### Sorting them by country

Since we know the country each doodle came from, we can compare drawing styles :) We loop over the countries we want to show and use the `search_drawings` method to get drawings for each country. For a list of valid country codes click <a href="https://www.iban.com/country-codes" target="_blank">here</a>.

In [None]:
what = "bread"
doodles = QuickDrawDataGroup(what, max_drawings=5000, recognized=True)


In [None]:
countries = ["US", "JP", "FR", "CL"]
cols = 7  

plt.figure(figsize=(20, 15))

for row, country in enumerate(countries):
  doodles_country = doodles.search_drawings(countrycode=country)
  random.shuffle(doodles_country)

  # Slicing avoids an IndexError when a country has fewer than `cols` doodles
  for col, doodle in enumerate(doodles_country[:cols]):
    img = doodle.get_image(stroke_color=(255, 0, 0), stroke_width=3, bg_color=(240, 240, 240))
    # Subplots are numbered from 1, row by row
    plt.subplot(len(countries), cols, row * cols + col + 1)
    plt.imshow(img)
    plt.title(country)
    plt.axis('off')

plt.show()
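
Before plotting one row per country, it can help to check how many doodles each country actually has, since a rarely seen country may have fewer than `cols` drawings. The sketch below counts by country code using made-up stand-in records; `FakeDoodle` and `count_by_country` are hypothetical names for this illustration, and only the `countrycode` attribute mirrors the real doodles.

```python
from collections import Counter, namedtuple

# Stand-in for a doodle with only the field used here; the values are made up
FakeDoodle = namedtuple("FakeDoodle", ["countrycode"])
sample = [FakeDoodle(c) for c in ["US", "JP", "US", "FR", "US", "JP"]]

def count_by_country(doodles):
    """Count how many doodles come from each country code."""
    return Counter(d.countrycode for d in doodles)

counts = count_by_country(sample)
cols = 7
for country in ["US", "JP", "FR", "CL"]:
    n = counts[country]  # a Counter returns 0 for missing keys
    note = "enough for a full row" if n >= cols else "fewer than cols"
    print(country, n, note)
```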

## Let's try an animation of a doodle

In [None]:
# Get a spider

what="spider"

qd = QuickDrawData()
spider=qd.get_drawing(what)

In [None]:
# Using the turtle graphics (from Artistic Math) we redraw the doodle

initializeTurtle()
bgcolor('purple')
color('white')
width(3)
showturtle()

for stroke in spider.strokes:
  color(color_random())
  penup()
  for x,y in stroke:
    goto(300+x,200+y)
    pendown()

show()
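
Each stroke is a list of `(x, y)` points, and the loop above shifts every point by a fixed offset of `(300, 200)`. If you would rather center a doodle, one option is to compute its bounding box first. This is a rough sketch with made-up strokes; `bounding_box` and `centering_offset` are hypothetical helpers, and the 800 by 500 canvas size is an assumption, not the actual size of the turtle canvas.

```python
def bounding_box(strokes):
    """Return (min_x, min_y, max_x, max_y) over all points of all strokes."""
    xs = [x for stroke in strokes for x, y in stroke]
    ys = [y for stroke in strokes for x, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)

def centering_offset(strokes, canvas_width, canvas_height):
    """Offset to add to every point so the doodle sits mid-canvas."""
    min_x, min_y, max_x, max_y = bounding_box(strokes)
    dx = (canvas_width - (max_x - min_x)) // 2 - min_x
    dy = (canvas_height - (max_y - min_y)) // 2 - min_y
    return dx, dy

# Made-up strokes, not taken from the dataset
example_strokes = [[(10, 20), (110, 20), (110, 70)], [(10, 70), (60, 45)]]
print(bounding_box(example_strokes))
print(centering_offset(example_strokes, 800, 500))  # assumed canvas size
```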

## Project: How do people draw circles

See if we can determine whether a circle is drawn clockwise or counterclockwise, and whether there are differences by country.

In [None]:
what = "circle"
all = QuickDrawDataGroup(what, max_drawings=50000, recognized=True)

In [None]:
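# From https://en.wikipedia.org/wiki/Curve_orientation#Orientation_of_a_simple_polygon

def winding(stroke):
  # Orientation determinant of the first three points of the stroke.
  # Assuming image coordinates (y grows downward), a positive value means
  # the stroke starts out clockwise on screen.
  # Strokes with fewer than three points return 0.
  if len(stroke) < 3:
    return 0
  (xa, ya) = stroke[0]
  (xb, yb) = stroke[1]
  (xc, yc) = stroke[2]
  return (xb*yc + xa*yb + ya*xc) - (ya*xb + yb*xc + xa*yc)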
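
The `winding` function looks only at the first three points of a stroke, so a small wobble at the start can flip the answer. One possible alternative, not used in this notebook, is the signed area from the shoelace formula, which takes every point into account. The sketch below treats the stroke as a closed polygon; `signed_area` is a hypothetical helper, and the two squares are made-up test shapes.

```python
def signed_area(stroke):
    """Shoelace formula over all points, closing the stroke back to its start."""
    total = 0
    n = len(stroke)
    for i in range(n):
        x1, y1 = stroke[i]
        x2, y2 = stroke[(i + 1) % n]  # wrap around to the first point
        total += x1 * y2 - x2 * y1
    return total / 2

# A square visited right, then down, then left. With y growing downward,
# as in image coordinates, this looks clockwise on screen.
square_cw = [(0, 0), (10, 0), (10, 10), (0, 10)]
square_ccw = list(reversed(square_cw))

print(signed_area(square_cw), signed_area(square_ccw))
```

With this convention a positive area matches a positive `winding` value, so the same `> 0` test could be used for clockwise circles.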

In [None]:
countries = ["US", "JP", "DE", "TW", "KR"]
countries_cw = []

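for country in countries:
  doodles = all.search_drawings(countrycode=country)
  count = 0
  for doodle in doodles:
    # Classify each circle by its last stroke
    w = winding(doodle.strokes[-1])
    if w > 0:
      count += 1
  # Percentage of clockwise circles (0 if the country has no drawings)
  countries_cw.append(100 * count / len(doodles) if doodles else 0)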

print(countries, countries_cw)
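
The percentage step in the loop above can also be written as a small function, which makes it easy to try on hand-made values. This is a minimal sketch; `percent_clockwise` is a hypothetical helper and the winding values are made up. A value of 0 (a stroke that is too short, or three points on a line) is counted as not clockwise, as in the loop above.

```python
def percent_clockwise(winding_values):
    """Percentage of positive winding values; 0.0 for an empty list."""
    if not winding_values:
        return 0.0
    clockwise = sum(1 for w in winding_values if w > 0)
    return 100 * clockwise / len(winding_values)

print(percent_clockwise([5, -3, 2, 0]))
print(percent_clockwise([]))
```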

In [None]:
plt.figure(figsize=(12, 8))
    
pos = list(range(len(countries_cw)))
plt.bar(pos, countries_cw)
plt.xticks(pos, countries, rotation=90)
plt.ylabel("Clockwise circles (%)")
plt.show()


Let us plot the circles from the US and JP separately for clockwise and counterclockwise circles. We need to run the loop again and save the strokes in separate lists for the two cases and the two countries. To make this easier, we create two new functions: one to sort the doodles into cw and ccw, and one to draw the circles using the turtle.

In [None]:
# Let us turn sorting and plotting into functions, so we can reuse them as needed.

# Function sort_doodles
# Sort the last stroke of each doodle into cw and ccw.
# Only the first 30 doodles are used.

def sort_doodles(doodles):
  cw = []
  ccw = []

  for doodle in doodles[:30]:
    w=winding(doodle.strokes[-1])
    if (w>0):
      cw.append(doodle.strokes[-1])
    else:
      ccw.append(doodle.strokes[-1])
  return(cw, ccw)

# Function draw_doodles
# Draw doodles using the turtle. Needs to be called
# after initializeTurtle.

def draw_doodles(x_start,y_start,col,strokes):
  for stroke in strokes:
    color(col)
    penup()
    for x,y in stroke:
      goto(x_start+x,y_start+y)
      pendown()


# Making the master plot using our new functions

initializeTurtle()

doodles=all.search_drawings(countrycode="US")
(us_cw, us_ccw) = sort_doodles(doodles)
draw_doodles( 10, 10,'red',us_cw)
draw_doodles(400, 10,'blue',us_ccw)
print ("US: cw:",len(us_cw)," ccw:",len(us_ccw))

doodles=all.search_drawings(countrycode="JP")
(jp_cw, jp_ccw) = sort_doodles(doodles)
draw_doodles( 10,300,'red',jp_cw)
draw_doodles(400,300,'blue',jp_ccw)
print ("JP: cw:",len(jp_cw)," ccw:",len(jp_ccw))

show()