# Bigtooth: Using Bluetooth signals to find life patterns
<em>Paul Gellerman, Will Sankey</em>

## Introduction
Bluetooth is one of many signal types that permeate our daily lives. Anyone carrying a smartphone can broadcast over Bluetooth, and many people are broadcasting without knowing it. This Fall 2016 District Data Labs Incubator project used this freely accessible information to assess patterns of life on a recurring commute and answer a basic question: how many strangers do I normally commute with?

To accomplish this task we captured the timestamp and unique ID of Bluetooth sources along the same geographic route for 22 days. The capture device is a Raspberry Pi 3 running [Blue Hydra](https://github.com/pwnieexpress/blue_hydra). It uses a [SENA UD100](https://www.amazon.com/Sena-UD100-Bluetooth-Class1-Adapter/dp/B01BHD7WR2) as the Bluetooth adapter, the Raspberry Pi 3 onboard Wi-Fi, and an Ubertooth dongle (not required). We eliminated noise from the dataset and found roughly YY% of the same individuals appeared along the route roughly ZZ% of the time.

The following provides snippets of the most critical parts of our code; for the full code, please see [our repository](https://github.com/DistrictDataLabs/bigtooth). Continued development is taking place at [pcgeller/bigtooth](https://github.com/pcgeller/bigtooth).


## A quick note on legality

Protecting sensitive information is important, and we have taken steps to obscure the data we collected; at no point was any individual's information exposed. Nearly all the information we collected consisted of the kind of device being used, where it was being used, the time, and a unique ID for the device that allowed us to see individuals across time.

In essence, the data we collected is very similar in nature to the data gathered by [wardriving](https://en.wikipedia.org/wiki/Wardriving). Much of the research we found on the legality of this activity pointed back to a passage that can be traced to a specific forum post on Politech. The post is purported to have been written by an FBI agent named %% %%%%. It states:

*“Identifying the presence of a wireless network may not be a
criminal violation, however, there may be criminal violations if the network is actually accessed including theft of services, interception of communications, misuse of computing resources, up to and including violations of the Federal Computer Fraud and Abuse Statute, Theft of Trade Secrets, and other federal violations.”* [Link to reference.](https://books.google.com/books?id=rmUV1vluO88C&pg=PA311&lpg=PA311&dq=Identifying+the+presence+of+a+wireless+network+may+not+be+a+criminal+violation,+however,+there+may+be+criminal+violations+if+the+network+is+actually+accessed+including+theft+of+services,+interception+of+communications,+misuse+of+computing+resources,+up+to+and+including+violations+of+the+Federal+Computer+Fraud+and+Abuse+Statute,+Theft+of+Trade+Secrets,+and+other+federal+violations&source=bl&ots=aWZCCJTegj&sig=1hwMPKM_eWLeBkh-vJoeEO-9kGY&hl=en&sa=X&ved=0ahUKEwjj-4-yyPLQAhXML8AKHURQC2oQ6AEIHTAA#v=onepage&q=Identifying%20the%20presence%20of%20a%20wireless%20network%20may%20not%20be%20a%20criminal%20violation%2C%20however%2C%20there%20may%20be%20criminal%20violations%20if%20the%20network%20is%20actually%20accessed%20including%20theft%20of%20services%2C%20interception%20of%20communications%2C%20misuse%20of%20computing%20resources%2C%20up%20to%20and%20including%20violations%20of%20the%20Federal%20Computer%20Fraud%20and%20Abuse%20Statute%2C%20Theft%20of%20Trade%20Secrets%2C%20and%20other%20federal%20violations&f=false)

During the course of this project the bigtooth device only captured data from devices using inquiry, scan, and low energy inquiry modes. These modes are standard services that let any Bluetooth master find slave devices. An extended inquiry response transfers additional data from a slave to a master after a request from the master device. However, since this is within the explicit functionality of the services, we believe it is in line with the principles stated above.

All of the technical methods used in the project are believed to be legal within the parameters laid out above. However, the ambiguity surrounding the legality is notable in itself. This legal grey area could hinder future research and development in the Bluetooth space.

## Methodology

The bigtooth project was designed to conduct pattern of life analysis using captured Bluetooth signals. Pattern of life analysis is a technique used to create an activity profile for an individual or geographic area. In this analysis an analyst tries to establish the *who, what, where, when, why* for a geographic area or individual. This project focused on the pattern of life of a specific geographic area, the route (*where*), and the individuals who traversed it (*who*) at specific times (*when*). To track the *who* along the route, the team first associated individuals with devices.

This association was made through inductive reasoning. The premise that a cell phone is typically owned and used by a single individual, with rare exceptions, should not be controversial. To extend the premise further: devices that are used with a cellular device are therefore also associated with an individual. From these premises it can be strongly argued that some, but not all, Bluetooth signals can be associated with an individual person.

As mentioned above, this is not always the case. Some signals come from devices that are stationary. These devices are not associated with an individual and are more likely to be associated with a location. To clean the data set these stationary devices were removed; how this was done will be discussed in greater detail under the data section.
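
To make the idea concrete, here is a minimal sketch of one possible way to flag stationary devices. It is not the filter used in this project. It assumes, purely for illustration, that a device fixed to a location stays in range only briefly each day as the sensor passes it, while a device carried by a fellow commuter stays in range for a longer stretch. The function names, the two-minute threshold, and the sample data are all hypothetical, and the heuristic would also flag people who simply pass by in the opposite direction.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def max_daily_dwell(sightings):
    """Return, per device, the longest first-to-last sighting span on any single day."""
    per_day = defaultdict(list)  # (device_id, date) -> timestamps seen that day
    for device_id, ts in sightings:
        per_day[(device_id, ts.date())].append(ts)

    dwell = defaultdict(timedelta)  # missing devices start at a zero-length span
    for (device_id, _day), stamps in per_day.items():
        span = max(stamps) - min(stamps)
        dwell[device_id] = max(dwell[device_id], span)
    return dict(dwell)


def likely_stationary(sightings, max_dwell=timedelta(minutes=2)):
    """Flag devices never in range for longer than `max_dwell` (assumed threshold)."""
    return sorted(d for d, span in max_daily_dwell(sightings).items() if span <= max_dwell)


# Made-up sightings: (device ID, timestamp)
sample = [
    ("AA:AA:AA:AA:AA:01", datetime(2016, 10, 3, 7, 30)),
    ("AA:AA:AA:AA:AA:01", datetime(2016, 10, 3, 7, 50)),      # in range for 20 minutes
    ("AA:AA:AA:AA:AA:02", datetime(2016, 10, 3, 7, 40, 0)),
    ("AA:AA:AA:AA:AA:02", datetime(2016, 10, 3, 7, 40, 30)),  # in range for 30 seconds
]
flagged = likely_stationary(sample)
print(flagged)
```

With location data attached to each sighting, a stricter check would be to flag devices that always appear at the same point on the route.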

## A few notes on Bluetooth

Bluetooth has become a ubiquitous service for short range communication between devices.

The data a Bluetooth device broadcasts can be made useful in several ways:

1. It provides a unique ID that identifies an individual device.
2. It can be coupled with a timestamp.
3. Other fields, such as the kind of device, are available for capture.
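
As a rough illustration of why the unique ID and timestamp pair is enough to look for patterns, the sketch below counts how many distinct days each device was observed. The record layout, function names, and sample values are assumptions made for this example and are not taken from the project's code.

```python
from collections import defaultdict
from datetime import datetime


def days_seen(sightings):
    """Map each device ID to the set of calendar dates it was observed on."""
    seen = defaultdict(set)
    for device_id, timestamp in sightings:
        seen[device_id].add(timestamp.date())
    return dict(seen)


def recurring_devices(sightings, min_days=2):
    """Return device IDs observed on at least `min_days` distinct dates."""
    return sorted(d for d, days in days_seen(sightings).items() if len(days) >= min_days)


# Made-up sightings: (device ID, timestamp)
sightings = [
    ("AA:AA:AA:AA:AA:01", datetime(2016, 10, 3, 7, 45)),
    ("AA:AA:AA:AA:AA:01", datetime(2016, 10, 3, 7, 46)),  # same day, counted once
    ("AA:AA:AA:AA:AA:01", datetime(2016, 10, 4, 7, 44)),
    ("AA:AA:AA:AA:AA:02", datetime(2016, 10, 3, 7, 50)),
]
print(recurring_devices(sightings))
```

Counting distinct dates rather than raw sightings keeps a device that is detected many times in one trip from looking like a regular.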

For more information about Bluetooth check the following links:

* [Bluetooth Special Interest Group](http://bluetooth.com)
* [Rover Labs - adoption rates](http://blog.roverlabs.co/post/117195525589/the-straight-goods-on-bluetooth-how-many)

## The bigtooth Project

To implement the methodology and explore the Bluetooth space we needed hardware capable of capturing Bluetooth signals. Commercial devices are available; however, they can cost upwards of $1,000. Since this price was outside the project's budget, a different solution was needed.

### The Device
The bigtooth device runs several pieces of software:
* Blue Hydra - Bluetooth monitoring software written by Pwnie Express.
* Python scripts - these were mostly for analyzing and wrangling the data, but also handled other functions like moving, saving, and renaming databases.
* BlueZ - the standard Bluetooth stack for Linux. Includes hcitool, a handy tool for running Bluetooth commands outside of Blue Hydra.
* Raspbian - we used a Raspberry Pi for the project and didn't think it was necessary to change the default operating system.


#### The tech
Software

* Blue Hydra
* Python scripts
* BlueZ
* Raspbian

Hardware

* Raspberry Pi 3
* SENA Bluetooth adapter
* Ubertooth
* Battery pack
* Really big antenna


### The route
A map.

## Cleaning the data

Once all the data was collected, we used a SQL database approach.
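
A SQL approach to this kind of data might look like the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The `sightings` table, its columns, and the sample rows are hypothetical stand-ins and do not reflect the schema of the project's actual database.

```python
import sqlite3

# In-memory database with a hypothetical sightings table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sightings (device_id TEXT NOT NULL, seen_at TEXT NOT NULL)")

# Made-up rows: (device ID, ISO 8601 timestamp)
rows = [
    ("AA:AA:AA:AA:AA:01", "2016-10-03 07:45:00"),
    ("AA:AA:AA:AA:AA:01", "2016-10-03 07:46:00"),
    ("AA:AA:AA:AA:AA:01", "2016-10-04 07:44:00"),
    ("AA:AA:AA:AA:AA:02", "2016-10-03 07:50:00"),
]
conn.executemany("INSERT INTO sightings VALUES (?, ?)", rows)

# Count the distinct calendar days on which each device was observed
query = """
SELECT device_id, COUNT(DISTINCT date(seen_at)) AS days_seen
FROM sightings
GROUP BY device_id
ORDER BY days_seen DESC, device_id
"""
summary = conn.execute(query).fetchall()
for device_id, days in summary:
    print(device_id, days)
conn.close()
```

Pushing the grouping into SQL keeps the Python side small, and the same query works unchanged against a database file on disk.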

## Analyzing it

## Results and discussion

## Other uses, future research

* Market research: See the count and variety of devices in an area. Use this to determine protocol adoption rates or market saturation.
* People tracking: Determine the route of a device through a sensor net. Walk down a hotel corridor to narrow down which room a device is in.
* Attack surface mapping: Survey Bluetooth devices to determine possible targets.
* Traffic engineering: Analyze traffic flows and volume.
* Security: Look for devices that are associated with known bad actors.

