# Etsy t-shirt listings dataset

## Why is this dataset interesting?

Etsy is a marketplace where people sell goods, many (though not all) of them handcrafted.

Marketplaces are interesting because studying them can reveal business opportunities.

Signs of a possible business opportunity include:
* Lots of sellers selling similar products with no single market leader
* High *view* counts with *low* average product rating
* High *rating* counts across very similar products

Etc...
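
As a rough sketch, the second signal (many views, low rating) can be turned into a simple filter. The `views` and `avg_review` column names match this dataset and the five sample rows are taken from it, but the function name `flag_high_views_low_rating` and both thresholds are assumptions chosen only for illustration.

```python
import pandas as pd

# Five rows from this dataset, reduced to the columns the filter needs.
listings = pd.DataFrame({
    "seller_name": ["EpicTees4You", "wethouse", "TheArtSwallow", "UnicornTees", "RCTees"],
    "views": [248365, 184947, 135820, 119570, 118832],
    "avg_review": [4.6235, 4.756, 4.7259, 4.8481, 4.9612],
})

def flag_high_views_low_rating(df, min_views=100_000, max_rating=4.75):
    """Return listings with at least min_views views and a rating below max_rating.

    Both thresholds are arbitrary placeholders; tune them to the data.
    """
    mask = (df["views"] >= min_views) & (df["avg_review"] < max_rating)
    return df[mask]

flagged = flag_high_views_low_rating(listings)
print(flagged)
```

On the full dataset, quantile-based cutoffs (for example `df["views"].quantile(0.9)`) are probably a better starting point than fixed numbers.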


## Ideas for questions this data can help you answer

* Which kinds of t-shirts get the most search traffic (view count)?
* How do price, seller reviews, and similar factors affect views? What about favorites?
* Feel like doing some text mining? Is there anything in the product descriptions that correlates with seller reviews, average reviews, or view count?
* How could you take these insights and build a product line of your own?
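
One hedged way to start on the text-mining question is to compare view counts for listings that do and do not mention a keyword. The rows below are made up for illustration, and `median_views_by_keyword` is a hypothetical helper, not part of the dataset or of pandas; only the `product_title` and `views` column names come from the data.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "product_title": [
        "Funny cat shirt, mens graphic tee",
        "Christmas gift for men, video game shirt",
        "Tie dye pocket t shirt",
        "Cat dad shirt, gift for him",
    ],
    "views": [1200, 900, 300, 1500],
})

def median_views_by_keyword(df, keywords, text_col="product_title"):
    """Compare median views for listings with and without each keyword."""
    rows = []
    for word in keywords:
        # Plain case-insensitive substring match, so "cat" also matches "category".
        has_word = df[text_col].str.contains(word, case=False, regex=False, na=False)
        rows.append({
            "keyword": word,
            "n_with": int(has_word.sum()),
            "median_views_with": df.loc[has_word, "views"].median(),
            "median_views_without": df.loc[~has_word, "views"].median(),
        })
    return pd.DataFrame(rows).set_index("keyword")

summary = median_views_by_keyword(toy, ["cat", "gift"])
print(summary)
```

The median is used instead of the mean because a few listings with very high view counts would otherwise dominate the comparison. The same function should work on the `detail_text` column by passing `text_col="detail_text"`.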

## Where you can go to get this kind of data

I wrote a crawler to collect this dataset. The crawler was written in Python using [Scrapy](https://scrapy.org/).

If you're going to write a crawler, it's best to check first that the site's robots.txt allows the crawl. So...

In [2]:
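import urllib.robotparser

# Download and parse Etsy's robots.txt
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.etsy.com/robots.txt")
rp.read()

# Crawl delay requested for all user agents (None if robots.txt does not set one)
crawl_delay = rp.crawl_delay("*")

# Check whether a sample listing page may be fetched by any user agent
test_crawl_url = "https://www.etsy.com/listing/478395857/big-little-shirt-big-little-sorority"
can_crawl_listings = rp.can_fetch("*", test_crawl_url)
print("We can crawl etsy? {0}".format(can_crawl_listings))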

We can crawl etsy? True


**Sweet! We can crawl the listings!**
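
The same kind of check can be tried offline, which is handy for testing crawler logic without hitting the network. This is only a sketch: the robots.txt body below is hypothetical (it is not Etsy's), `example.com` is a placeholder host, and `load_rules` and `polite_delay` are illustrative helper names.

```python
import urllib.robotparser

# Hypothetical robots.txt content for illustration; NOT Etsy's actual rules.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /cart
"""

def load_rules(robots_txt):
    """Build a parser from robots.txt text instead of fetching a URL."""
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp

def polite_delay(rp, user_agent="*", default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay, or a fallback."""
    delay = rp.crawl_delay(user_agent)
    return float(delay) if delay is not None else default

rules = load_rules(SAMPLE_ROBOTS_TXT)
print(rules.can_fetch("*", "https://example.com/listing/123"))
print(rules.can_fetch("*", "https://example.com/cart"))
print(polite_delay(rules))
```

A crawler would sleep for `polite_delay(rules)` seconds between requests. Scrapy can also enforce robots.txt itself through its `ROBOTSTXT_OBEY` setting.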

Anyway... on to the data...

## Some data - ~10500 t-shirt listings

In [4]:
import pandas as pd
etsydata = pd.read_csv('../datasets/etsy-mens-t-shirts-11-15-2017.csv')
etsydata.describe()

Unnamed: 0,listing_id,views,avg_review,product_price,favorites,seller_reviews
count,10499.0,10494.0,10421.0,10499.0,10499.0,10499.0
mean,5249.0,3168.035354,4.815125,19.33927,420.364701,1212.377179
std,3030.944572,7794.997142,0.204381,17.513279,1069.167123,2712.049925
min,0.0,2.0,1.0,0.0,0.0,0.0
25%,2624.5,348.0,4.7843,14.0,36.0,63.0
50%,5249.0,1006.0,4.8681,17.0,112.0,285.0
75%,7873.5,2839.0,4.927,20.0,359.0,1014.0
max,10498.0,248365.0,5.0,525.0,25846.0,27408.0
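
The summary shows heavily right-skewed counts (mean views of about 3168 against a median of 1006), so a rank-based correlation is a reasonable first look at how price and seller reviews relate to views and favorites. The sketch below uses the numeric values of the first five rows of this dataset purely as sample input; five rows are far too few to conclude anything, and `rank_correlation` is a hypothetical helper name.

```python
import pandas as pd

# Numeric columns from the first five rows of the dataset (sample input only).
sample = pd.DataFrame({
    "views": [248365.0, 184947.0, 135820.0, 119570.0, 118832.0],
    "product_price": [29.0, 23.0, 10.0, 14.0, 24.0],
    "favorites": [6146.0, 210.0, 2931.0, 5064.0, 18528.0],
    "seller_reviews": [1344.0, 1134.0, 1112.0, 9765.0, 4534.0],
})

def rank_correlation(df, columns):
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    return df[columns].rank().corr()

corr = rank_correlation(
    sample, ["product_price", "seller_reviews", "views", "favorites"]
)
# Rows: candidate drivers. Columns: the outcomes of interest.
print(corr.loc[["product_price", "seller_reviews"], ["views", "favorites"]])
```

On the real data, passing `etsydata` instead of `sample` should work unchanged, since `corr()` skips missing values pair by pair.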


In [5]:
etsydata.head()

Unnamed: 0,listing_id,detail_text,seller_name,tags,url,views,avg_review,product_price,favorites,posted,seller_reviews,product_title
0,298,"\n King and Queen shirts, King 01, ...",EpicTees4You,"Related to this item,Clothing,Unisex Adult Clo...",https://www.etsy.com/listing/224976011/king-an...,248365.0,4.6235,29.0,6146.0,"Listed on Nov 15, 2016",1344.0,"King and Queen shirts, King 01, Queen 01 Coupl..."
1,5248,\n Well hello there. This shirt cau...,wethouse,"Related to this item,Clothing,Unisex Adult Clo...",https://www.etsy.com/listing/167282687/rip-lou...,184947.0,4.756,23.0,210.0,"Listed on Nov 2, 2016",1134.0,RIP Lou Reed Shirt // silver shirt
2,2737,"\n -MADE TO ORDER -,please allow 1 ...",TheArtSwallow,"Related to this item,Clothing,Unisex Adult Clo...",https://www.etsy.com/listing/199727284/sale-ti...,135820.0,4.7259,10.0,2931.0,"Listed on Nov 12, 2016",1112.0,SALE Tie Dye Pocket T Shirt- Choose your fabric!
3,378,\n I LOVE it when MY WIFE® Brand T-...,UnicornTees,"Related to this item,Clothing,Unisex Adult Clo...",https://www.etsy.com/listing/167509802/christm...,119570.0,4.8481,14.0,5064.0,"Listed on Nov 15, 2016",9765.0,Christmas Gift for Men Video Game Gifts Videog...
4,5085,"\n Cat shirt, funny tshirts, mens g...",RCTees,"Related to this item,Clothing,Men's Clothing,S...",https://www.etsy.com/listing/89209028/cat-shir...,118832.0,4.9612,24.0,18528.0,"Listed on Nov 15, 2016",4534.0,"Cat shirt, funny tshirts, mens graphic tee, me..."
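
The `posted` column arrives as free text such as `Listed on Nov 15, 2016`, so any question about listing age needs it parsed first. A minimal sketch, assuming every value follows that same pattern and that month abbreviations are in English (the full column may contain other formats, which this would turn into `NaT`):

```python
import pandas as pd

# Values as they appear in the `posted` column of the rows above.
posted = pd.Series([
    "Listed on Nov 15, 2016",
    "Listed on Nov 2, 2016",
    "Listed on Nov 12, 2016",
])

# Strip the fixed prefix, then parse "Nov 15, 2016" style dates.
# errors="coerce" turns anything that does not match into NaT instead of raising.
posted_date = pd.to_datetime(
    posted.str.replace("Listed on ", "", regex=False),
    format="%b %d, %Y",
    errors="coerce",
)
print(posted_date.dt.strftime("%Y-%m-%d").tolist())
```

Applied to the full table, this would be `etsydata["posted_date"] = ...` with `etsydata["posted"]` in place of the sample series.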
