# Annotate shodan.io webcam images

This project pulls random internet-exposed webcam images from Shodan and labels them with Google Vision AI.

## Requirements

- Shodan Python library
- Google Cloud Vision Python API library
- Pillow for image manipulation

The last two come preinstalled in the default Python 3 notebook on GCP, so only shodan needs to be installed.

In [1]:
!pip install shodan

Collecting shodan
  Downloading shodan-1.25.0.tar.gz (51 kB)
  Preparing metadata (setup.py) ... done
Collecting click-plugins
  Downloading click_plugins-1.1.1-py2.py3-none-any.whl (7.5 kB)
Collecting XlsxWriter
  Downloading XlsxWriter-3.0.2-py3-none-any.whl (149 kB)
Building wheels for collected packages: shodan
  Building wheel for shodan (setup.py) ... done
  Created wheel for shodan: filename=shodan-1.25.0-py3-none-any.whl size=46413 sha256=da23bdadb714d26995324b9d6c1212078391840c6cd1429ccf2b674731f895b4
  Stored in directory: /home/jupyter/.cache/pip/wheels/61/34/9b/3e7801d3749313a89526c0ba318c1b4a66c93db6ba464983e4
Successfully built shodan
Installing collected packages: XlsxWriter, click-plugins, shodan
Successfully installed XlsxWriter-3.0.2 click-plugins-1.1.1 shodan-1.25.0


## Get shodan data

The Shodan [CLI](https://help.shodan.io/command-line-interface/0-installation) downloads search results as a `*.json.gz` file. Pass it a query to filter the results and a limit to cap how many are downloaded.

You will need to initialize the Shodan CLI with [your API key](https://account.shodan.io/) first.

```
pip install shodan
shodan init <APIKEY>
shodan download /tmp/webcam.json.gz --limit 20 has_screenshot:1 screenshot.label:webcam
```
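To make the downloaded file less of a black box, here is a minimal sketch of reading it with only the standard library. It assumes the file is gzip-compressed with one JSON banner per line, and that a screenshot, when present, sits under `banner["opts"]["screenshot"]["data"]` as base64 text. The helper names `iter_banners` and `screenshot_bytes` are hypothetical, and the sample file is made-up data rather than real Shodan output. The notebook itself uses `shodan.helpers` for this step.

```python
import base64
import gzip
import json
import os
import tempfile


def iter_banners(path):
    """Yield one banner dict per non-empty line of a gzip-compressed JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def screenshot_bytes(banner):
    """Return the decoded screenshot bytes, or None if the banner has none.

    Assumption: the screenshot lives at banner["opts"]["screenshot"]["data"].
    """
    shot = banner.get("opts", {}).get("screenshot")
    if not shot or "data" not in shot:
        return None
    return base64.b64decode(shot["data"])


# Build a tiny stand-in file (made-up banners, documentation-range IPs).
sample = [
    {"ip_str": "192.0.2.1",
     "opts": {"screenshot": {"data": base64.b64encode(b"fake-jpeg").decode("ascii")}}},
    {"ip_str": "192.0.2.2", "opts": {}},
]
sample_path = os.path.join(tempfile.mkdtemp(), "sample.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as fh:
    for banner in sample:
        fh.write(json.dumps(banner) + "\n")

# Walk the file and report which banners carry a screenshot.
for banner in iter_banners(sample_path):
    data = screenshot_bytes(banner)
    print(banner["ip_str"], "screenshot" if data else "no screenshot")
```

Streaming line by line keeps memory use flat even for large downloads, which is why the sketch uses a generator instead of loading the whole file.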

In [70]:
import shodan
import shodan.helpers as helpers

import os
import sys
import io

import base64
from google.cloud import vision
from PIL import Image, ImageDraw, ImageFont

In [112]:
class img:
    """
    Holds an image's name, file path, and Vision API annotations.
    """
    def __init__(self, name='', filename='', labels=None, objects=None):
        self.name = name
        self.filename = filename
        # Fall back to empty lists so print() and the annotation loop work without results
        self.labels = labels if labels is not None else []
        self.objects = objects if objects is not None else []

    def print(self):
        """Print the labels and objects with their confidence scores."""
        l = ['{} ({:.2f}%)'.format(i.description, i.score * 100.) for i in self.labels]
        o = ['{} ({:.2f}%)'.format(i.name, i.score * 100.) for i in self.objects]
        print('{} ({})\n\tLabels: {}\n\tObjects: {}'.format(self.name, self.filename, ','.join(l), ','.join(o)))

    def obj_annotate_and_write(self):
        """Draw each object's bounding polygon and name, then save an '_annotated' copy."""
        image = Image.open(self.filename)  # load the saved screenshot with PIL
        width, height = image.size
        draw = ImageDraw.Draw(image)
        try:
            font = ImageFont.truetype('/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf', 16)
        except OSError:
            font = ImageFont.load_default()  # font file not present on this machine
        for object_ in self.objects:
            vects = object_.bounding_poly.normalized_vertices
            # Denormalize: scale the 0..1 coordinates to pixels
            points = [(v.x * width, v.y * height) for v in vects]
            # Close the polygon by returning to the first vertex
            draw.line(points + [points[0]], width=4, fill='white')
            x0, y0 = points[0]
            draw.text((x0 + 10, y0), font=font, text=object_.name, fill='white')
        # splitext keeps dots inside the IP-based file name intact
        root, ext = os.path.splitext(self.filename)
        image.save(root + '_annotated' + ext)
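
The two pieces of arithmetic in the class, scaling normalized vertices to pixels and deriving the annotated file name, can be checked without calling the Vision API. The following is a minimal sketch under that framing. The function names `denormalize` and `annotated_path` are hypothetical, and the box coordinates are invented for illustration.

```python
import os


def denormalize(vertices, width, height):
    """Scale (x, y) pairs in the 0..1 range to pixel coordinates."""
    return [(x * width, y * height) for x, y in vertices]


def annotated_path(filename):
    """Insert '_annotated' before the file extension."""
    root, ext = os.path.splitext(filename)
    return root + "_annotated" + ext


# Hypothetical box covering the middle of a 640x480 image.
box = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.75), (0.25, 0.75)]
pixels = denormalize(box, 640, 480)
print(pixels)
# → [(160.0, 120.0), (480.0, 120.0), (480.0, 360.0), (160.0, 360.0)]

print(annotated_path("/tmp/192.0.2.1.jpg"))
# → /tmp/192.0.2.1_annotated.jpg
```

Pillow places the origin at the top-left corner of the image, so larger `y` values are lower in the picture.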

In [113]:
input_file = '/tmp/webcam.json.gz'
output_dir = '/home/jupyter/webpics'
client = vision.ImageAnnotatorClient()

# Make sure the directory exists
os.makedirs(output_dir, exist_ok=True)

images = []
for banner in helpers.iterate_files(input_file):
    # Try to grab the screenshot from the banner
    screenshot = helpers.get_screenshot(banner)

    # If we found a screenshot, save it and send it to the Vision API
    if screenshot:
        # The image data is stored base64-encoded; decode it once
        content = base64.b64decode(screenshot['data'])
        filename = '{}/{}.jpg'.format(output_dir, banner['ip_str'])
        # The with block closes the file, so the data is on disk before it is reopened for annotation
        with open(filename, 'wb') as image_file:
            image_file.write(content)

        image = vision.Image(content=content)
        labels = client.label_detection(image=image).label_annotations
        objects = client.object_localization(image=image).localized_object_annotations

        ii = img(name=banner['ip_str'], filename=filename, labels=labels, objects=objects)
        images.append(ii)
        ii.print()
        if len(objects) > 0:
            ii.obj_annotate_and_write()

176.221.104.30 (/home/jupyter/webpics/176.221.104.30.jpg)
	Labels: Automotive lighting (91.84%),Asphalt (82.26%),Flash photography (80.48%),Road surface (80.34%),Building (78.76%),Automotive tire (77.98%),Tints and shades (77.21%),Bumper (76.64%),Automotive exterior (76.14%),Headlamp (74.23%)
	Objects: 
187.233.195.99 (/home/jupyter/webpics/187.233.195.99.jpg)
	Labels: Hood (91.23%),Automotive lighting (87.22%),Automotive tire (86.68%),Font (82.48%),Line (81.62%),Bumper (80.21%),Fender (80.03%),Rectangle (79.05%),Vehicle door (78.74%),Automotive exterior (78.69%)
	Objects: Animal (64.50%),Mirror (50.05%)
87.79.193.193 (/home/jupyter/webpics/87.79.193.193.jpg)
	Labels: Window (91.62%),Grey (84.31%),Black-and-white (84.03%),Style (83.87%),Automotive lighting (81.39%),Flash photography (80.47%),Tints and shades (77.26%),Glass (76.15%),Monochrome (73.72%),Monochrome photography (72.96%)
	Objects: Animal (71.20%)
190.151.78.69 (/home/jupyter/webpics/190.151.78.69.jpg)
	Labels: Wood (81.79%)