# Teeview Analytics

The purpose of this project is to build a simple analytics tool that lets us keep tabs on the Teespring campaigns indexed by teeview that have the potential to be successful, so that we draw inspiration only from this subset.

The initial outcome should be a bot (think Slack) that informs us when a new potentially successful campaign is identified, so that we can jump on it right away and draw inspiration from it. The purpose of the bot is to deliver near-real-time notifications, since in these cases timing is fundamental.

Campaigns with success potential, which are the ones we need to pay attention to, will be identified using basic physics principles. We will get the velocity of the campaign and its acceleration by taking numerical derivatives of its sales over time. This will produce 3 plots: the position (sales), velocity and acceleration of a campaign. A successful campaign has a good final position, so it must have had high velocity values, and high acceleration values to reach such velocities. This is the case because the length of the campaigns is fixed. It is the equivalent of a sprint for runners, where the distance is defined beforehand.
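As a minimal sketch of the numerical derivative described above, the snippet below computes velocity and acceleration from a made-up series of cumulative sales, using difference quotients between consecutive samples. The function name `finite_difference` and the sample numbers are illustrative assumptions, not part of the project.

```python
def finite_difference(values, times):
    """Difference quotient between each pair of consecutive samples."""
    return [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]


# Hypothetical sample: cumulative units sold at each hourly check
hours = [0, 1, 2, 3, 4]
position = [0, 2, 6, 12, 20]

# Velocity is the rate of change of sales (units per hour)
velocity = finite_difference(position, hours)

# Each velocity value belongs to the midpoint of the interval it was
# computed over, so acceleration is differentiated against the midpoints
midpoints = [(hours[i] + hours[i + 1]) / 2 for i in range(len(hours) - 1)]
acceleration = finite_difference(velocity, midpoints)
```

Each differentiation shortens the series by one sample, so at least three sales readings are needed before an acceleration value exists.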

Given this context, the goal is to identify potentially successful campaigns, i.e. to notice campaigns that have the potential to be successful before they actually succeed (since once they succeed the market will be saturated with that design and it will no longer represent an opportunity).

To make this identification one needs to check the acceleration value in conjunction with the velocity. There are at least 2 cases in which you would definitely want to get a piece of the action:

1. Acceleration is practically null or even negative but velocity is still quite high.
2. Velocity is still slow but acceleration is really high.
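The two cases can be sketched as a simple check. The thresholds and the function name `worth_watching` are placeholders chosen for illustration; real values would have to be tuned on observed campaign data.

```python
# Hypothetical thresholds; these numbers are assumptions, not measured values
HIGH_VELOCITY = 5.0       # units sold per hour
LOW_ACCELERATION = 0.5    # "practically null" upper bound
HIGH_ACCELERATION = 2.0   # units sold per hour, per hour


def worth_watching(velocity, acceleration):
    """Return the matching case number (1 or 2), or None if neither applies."""
    if velocity >= HIGH_VELOCITY and acceleration <= LOW_ACCELERATION:
        return 1  # still selling fast although it has stopped accelerating
    if velocity < HIGH_VELOCITY and acceleration >= HIGH_ACCELERATION:
        return 2  # slow for now, but picking up speed quickly
    return None
```

This only covers the two cases listed above; other combinations, such as high velocity together with high acceleration, would need their own rule.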

There should be a coefficient that indicates whether we should jump on a campaign or not. To be developed.

### Crawler

The first step is to crawl [https://www.teeview.org] and get the latest campaigns. Let's define the "latest campaigns" as the ones that were added during the last day. We should only select those that show one of these messages:

1. "31 sold, available until [Thursday!]"
2. "Only 3 more needed to print!" 
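A minimal sketch of how these two messages could be recognized and turned into numbers follows. The patterns are modeled only on the two examples above and assume the square brackets mark the variable part of the text; the wording on the live page may differ, and `parse_sales_message` is a hypothetical helper name.

```python
import re

# Patterns modeled on the two example messages; treat them as assumptions
SOLD_PATTERN = re.compile(r"(\d+)\s+sold, available until\s+(.+?)!?$")
NEEDED_PATTERN = re.compile(r"Only\s+(\d+)\s+more needed to print!?")


def parse_sales_message(text):
    """Return a dict describing the sales message, or None if none is found."""
    text = text.strip()
    match = SOLD_PATTERN.search(text)
    if match:
        return {"kind": "sold", "count": int(match.group(1)), "until": match.group(2)}
    match = NEEDED_PATTERN.search(text)
    if match:
        return {"kind": "needed", "count": int(match.group(1))}
    return None
```

A campaign whose text yields `None` reports no sales data and would be skipped.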

In order to do this we need to get the url of the campaign and go to [http://www.teespring.com] to see if this message appears on the campaign. If it does, then we have data to populate our database, and this url goes into the list of urls that we will follow periodically, along with how long ago it was added. We will run the script periodically to get the sales data and be able to make the position, velocity and acceleration plots.

#### Import Modules

In [27]:
import requests
from bs4 import BeautifulSoup
import sqlite3

#### Create Database Tables

There should be 2 database tables. 

One of them will store the information of relevant campaigns. A relevant campaign is one that is obtained through the "teeview_scraper" function and subsequently kept by the "teeview_data_filter" function (which takes all campaigns added within the last day and keeps only the ones that report sales data). This table will be called "Campaigns".

The second database table will consist of the entries produced by the "teespring_data_updater" function, which will query Teespring for each campaign in "Campaigns" and add the latest sales data as an entry to the table. This is the data that will be used to plot position, velocity and acceleration. This table will be called "SalesData".

The database as a whole should be cleaned every day or so to keep it small and running smoothly. The cleaning process will consist of going through the database to see which campaigns have ended and removing the respective rows from the "Campaigns" and "SalesData" tables.
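The cleaning step could look roughly like the sketch below. It uses the lowercase table and column names from this section's table-creation cell (`campaigns`, `sales_data`), and it assumes the list of ended campaign URLs has already been determined elsewhere. The helper name `remove_ended_campaigns` and the sample rows are made up for the demonstration.

```python
import sqlite3


def remove_ended_campaigns(conn, ended_urls):
    """Delete the rows of the given campaign URLs from both tables."""
    params = [(url,) for url in ended_urls]
    with conn:  # commits on success, rolls back on error
        conn.executemany("DELETE FROM sales_data WHERE campaign_url = ?", params)
        conn.executemany("DELETE FROM campaigns WHERE url = ?", params)


# Demonstration on an in-memory database with made-up rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaigns(url, img_src, name, time_ago)")
conn.execute("CREATE TABLE sales_data(campaign_url, sales, timestamp)")
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?, ?, ?)",
    [("https://example.com/a", None, "A", "2 hours ago"),
     ("https://example.com/b", None, "B", "5 hours ago")],
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [("https://example.com/a", 10, 1000), ("https://example.com/b", 4, 1000)],
)
remove_ended_campaigns(conn, ["https://example.com/a"])
remaining = [row[0] for row in conn.execute("SELECT url FROM campaigns")]
```

Deleting from both tables inside one transaction keeps them consistent if the script is interrupted halfway.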

In [28]:
conn = sqlite3.connect('teeview_analytics.db')
len(conn.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='campaigns'").fetchall())

1

In [29]:
# Connect to "teeview_analytics" database
conn = sqlite3.connect('teeview_analytics.db')
# Create "campaigns" table if it does not exist
campaigns_table = conn.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='campaigns'").fetchall()
if len(campaigns_table) == 0:
    conn.execute("create table campaigns(url, img_src, name, time_ago)")
# Create "sales_data" table if it does not exist
sales_data_table = conn.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='sales_data'").fetchall()
if len(sales_data_table) == 0:
    conn.execute("create table sales_data(campaign_url, sales, timestamp)")
# Close connection
conn.close()

In [30]:
conn = sqlite3.connect('teeview_analytics.db')
len(conn.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='campaigns'").fetchall())

1

#### Scrape Data From Teeview

In [31]:
# This function makes successive GET requests to teeview and returns the new campaigns that were added
# within the last day or more recently than the latest campaign saved on the last query (latest_campaign_link)
def teeview_scraper():
    # Set variables: first "page" to query, flag to keep querying while data is "within_day", and list to store "teeview_data"
    page = 1
    within_day = True
    teeview_data = []
    # Loop over all campaigns that were added within the last day
    while within_day:
        # Form the teeview url listing the active campaigns on page "page"
        url = "https://www.teeview.org/site/index?active=true&page={0}&per-page=12".format(page)
        # Make request to get the page of teeview and parse it
        response = requests.get(url)
        html = BeautifulSoup(response.text, 'html.parser')
        thumbnails = html.select(".thumbnail")
        # Stop if the page lists no campaigns, otherwise the loop would never end
        if not thumbnails:
            break
        # Get links of teespring campaigns of the page queried
        for thumbnail in thumbnails:
            campaign_url = thumbnail.select("h3 a")[0]['href']
            campaign_time_ago = thumbnail.select("p.text-muted small")[0].getText()
            # Break the loop if we have reached the campaigns that were added longer than a day ago
            if "day" in campaign_time_ago:
                within_day = False
                break
            # Break the loop if we have reached the latest_campaign_queried
                # //>> Code goes here to get latest_campaign_link from database and check if link matches,
                # //>> if it does set within_day to False and break loop
            # Add the campaign data to teeview_data
            teeview_data.append([campaign_url, campaign_time_ago])
        # Increase page counter
        page += 1
    # Return the gathered "teeview_data"
    return teeview_data

In [32]:
# This function will go through the new campaigns returned by the "teeview_scraper" function
# and for each of them request the Teespring campaign in order to find out whether it is relevant,
# i.e. whether the campaign reports sales data. If it does, it adds the
# new campaign to the respective database table.
def teeview_data_filter():
    # //>> Code goes here
    return True

In [33]:
# This function will go through the database table that stores the Teespring campaigns that
# report sales data and query each of them on Teespring to get the latest info on sales which
# will be stored as a new entry on the database table that stores the sales or "position" data
# of each relevant campaign.
def teespring_data_updater():
    # //>> Code goes here
    return True
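
One possible shape for the updater's storage step is sketched below. This is not the intended implementation of "teespring_data_updater". It assumes a hypothetical callable `fetch_sales` that takes a campaign URL and returns the current number of units sold (or `None` if the page reports nothing), so the database logic can be tried without network access. The helper name `record_sales_snapshots` and the sample data are also assumptions.

```python
import sqlite3
import time


def record_sales_snapshots(conn, fetch_sales, now=None):
    """Append one sales_data row per campaign and return how many were added."""
    timestamp = int(time.time()) if now is None else now
    urls = [row[0] for row in conn.execute("SELECT url FROM campaigns")]
    rows = []
    for url in urls:
        sales = fetch_sales(url)  # hypothetical fetcher supplied by the caller
        if sales is not None:
            rows.append((url, sales, timestamp))
    with conn:  # commit all new rows together
        conn.executemany("INSERT INTO sales_data VALUES (?, ?, ?)", rows)
    return len(rows)


# Demonstration with an in-memory database and a stand-in fetcher
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE campaigns(url, img_src, name, time_ago)")
conn.execute("CREATE TABLE sales_data(campaign_url, sales, timestamp)")
conn.execute(
    "INSERT INTO campaigns VALUES (?, ?, ?, ?)",
    ("https://example.com/a", None, "A", "2 hours ago"),
)
fake_sales = {"https://example.com/a": 31}
added = record_sales_snapshots(conn, fake_sales.get, now=1000)
stored = conn.execute("SELECT campaign_url, sales, timestamp FROM sales_data").fetchall()
```

Running this on a schedule would accumulate the (timestamp, sales) pairs needed for the position plot, from which velocity and acceleration follow.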