# Applied Data Science — Stack Overflow 

## John Burt
### Portland Data Science Group<br/>Applied Data Science Meetup series

### Notebook purpose: demonstrate StackAPI, a Python wrapper for reading Stack Exchange data

The StackAPI Python module lets you query and download Stack Exchange data. It provides an alternative to the Stack Exchange Data Explorer for acquiring Stack Overflow questions, answers, comments, etc.


### Basic documentation
- https://stackapi.readthedocs.io/en/latest/


### Filters (specify what fields are returned with queries):
- http://api.stackexchange.com/docs/filters


### Install:
- pip install stackapi

## Basic query

A query returns a dict whose `items` key holds a list of results. Each item is itself a dict of fields, and the `filter` parameter controls which fields are included.

#### Note: unless you pass `filter='withbody'`, the actual text (the `body` field) will not be returned.

In [12]:
from stackapi import StackAPI

SITE = StackAPI('stackoverflow')
result = SITE.fetch('comments', filter='withbody')

text = [item['body'] for item in result['items']]
print(text[:5])


['Using &#39;assignment expressions&#39;:  following error:', '@MarcGlisse and for <code>-O3</code>, it&#39;s loading and comparing the <code>char</code>s as integers, probably because the compiler realizes that they can only be <code>&#39;A&#39;</code>s. Unfortunately, the naive <code>-O0</code> used vector-instructions that beat this optimization easily.', 'Wait? Is this just try to create replacement of <code>JSONDecoder</code>’s <code>decode</code> without possibility to change decoder and catch errors?', 'just do not use new, use std::unique_ptr instead:', '@MelonieRichey - No problem! If this answer has sorted out your particular question, please click the check mark at the top left of the answer to mark your question as closed.']
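The bodies printed above still contain HTML entities (`&#39;`) and inline tags (`<code>`). For text analysis it usually helps to clean these first. Below is a minimal sketch using only the standard library; `clean_body` and `TAG_RE` are hypothetical names introduced here, and the regex is a crude tag stripper, not a full HTML parser.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <code> or </code>.
# This is a rough heuristic, not a real HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def clean_body(body):
    """Remove tags first, then decode entities such as &#39; into characters."""
    without_tags = TAG_RE.sub("", body)
    return html.unescape(without_tags)


# A shortened fragment in the style of the comment bodies shown above.
sample = "it&#39;s loading the <code>char</code>s as integers"
print(clean_body(sample))
```

Tags are stripped before entities are decoded so that escaped text such as `&lt;b&gt;` survives as literal `<b>` rather than being mistaken for markup. For heavily formatted posts, a real HTML parser would be a safer choice.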


## Examples from the StackAPI advanced documentation

See https://stackapi.readthedocs.io/en/latest/user/complex.html

In [7]:
from datetime import datetime

# All Stack Overflow Users Created on Feb. 27th of 2011
SITE = StackAPI('stackoverflow')
SITE.max_pages=10
users = SITE.fetch('users', fromdate=datetime(2011,2,27), todate=datetime(2011,2,28))
print(len(users['items']))


616
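A single `fetch` call is bounded by `page_size * max_pages`, so it is worth checking whether a count like the one above is the full answer or just the cap. The sketch below assumes StackAPI's defaults are `page_size=100` and `max_pages=5`, and that the result dict carries the API's `has_more` flag; both are assumptions to verify against the StackAPI docs. `fetch_cap` and `looks_truncated` are hypothetical helpers, and the sample result is made up to mirror the shape of a real one.

```python
def fetch_cap(page_size=100, max_pages=5):
    """Upper bound on items one fetch call can return (defaults are assumed)."""
    return page_size * max_pages


def looks_truncated(result, page_size=100, max_pages=5):
    """True if the result hit the cap or the API reported more pages."""
    items = result.get("items", [])
    hit_cap = len(items) >= fetch_cap(page_size, max_pages)
    return bool(result.get("has_more")) or hit_cap


# Made-up result with the same item count as the users query above.
sample_result = {"items": [{"user_id": i} for i in range(616)], "has_more": False}

print(fetch_cap(max_pages=10))
print(looks_truncated(sample_result, max_pages=10))
```

With `max_pages=10` and an assumed page size of 100, the cap is 1000 items, so a count of 616 is consistent with the query having returned everything.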


In [8]:
# Comments with at least a score of 10 on Ask Ubuntu
SITE = StackAPI('askubuntu')
comments = SITE.fetch('comments', min=10, sort='votes')
print(len(comments['items']))


In [9]:
# Of three specific posts on Server Fault, which one has the most recent activity
SITE = StackAPI('serverfault')
SITE.max_pages=1
SITE.page_size=1
post = SITE.fetch('posts', ids=[3743, 327738, 339426], sort='activity', order='desc')
print(post['items'][0]['post_id'])


339426
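The cell above lets the server do the work: it sorts by activity in descending order and returns only one item. The same answer can be picked client-side from a full result, which is handy when the items are already downloaded. This sketch assumes each post dict has a `last_activity_date` field holding a Unix timestamp; `most_recent` is a hypothetical helper, and the sample posts (IDs and timestamps) are made up.

```python
def most_recent(posts):
    """Return the post dict with the largest last_activity_date."""
    return max(posts, key=lambda post: post["last_activity_date"])


# Made-up posts in the shape of result['items']; not real Server Fault data.
sample_posts = [
    {"post_id": 101, "last_activity_date": 1300000000},
    {"post_id": 102, "last_activity_date": 1350000000},
    {"post_id": 103, "last_activity_date": 1320000000},
]

print(most_recent(sample_posts)["post_id"])
```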


In [10]:
# Any favorites added in the month of December 2011 by Darin Dimitrov
from datetime import datetime

SITE = StackAPI('stackoverflow')
favorites = SITE.fetch('users/{ids}/favorites', min=datetime(2011, 12, 1),
                       max=datetime(2012, 1, 1), sort='added', ids=[29407])
print(len(favorites['items']))


In [11]:
# Questions created during the Modern Warfare 3 vs. Skyrim contest with the skyrim tag
#  and a score of at least 10 on Gaming Stack Exchange
SITE = StackAPI('gaming')
from datetime import datetime
questions = SITE.fetch('questions', fromdate=datetime(2011,11,11), 
                       todate=datetime(2011,11,19), min=10, sort='votes', tagged='skyrim')
len(questions['items'])


239
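The date-bounded queries above pass `datetime` objects, while the Stack Exchange API itself expects dates as Unix epoch seconds. The sketch below shows what that conversion looks like for the contest window. It assumes naive datetimes are treated as UTC, and that StackAPI performs an equivalent conversion internally; that second point is an assumption, not something this notebook confirms. `to_epoch` is a hypothetical helper.

```python
import calendar
from datetime import datetime


def to_epoch(dt):
    """Treat a naive datetime as UTC and return whole seconds since 1970-01-01."""
    return calendar.timegm(dt.utctimetuple())


start = to_epoch(datetime(2011, 11, 11))
end = to_epoch(datetime(2011, 11, 19))

# Length of the query window in whole days.
window_days = (end - start) // 86400
print(start, end, window_days)
```

Seeing the raw timestamps makes it easier to compare a query against the `creation_date` values that come back in each item, which are also epoch seconds.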