# Stack Exchange API in Python

By Sebastian Shirk and Avery Fernandez

The Stack Exchange API provides programmatic access to data from Stack Exchange network sites like Stack Overflow, Server Fault, and Super User, allowing developers to interact with questions, answers, users, and more.

Please see the following resources for more information on API usage:

- Documentation
    - <a href="https://api.stackexchange.com/docs" target="_blank">Stack Exchange API</a>
    - <a href="https://api.stackexchange.com/docs/authentication" target="_blank">Stack Exchange API Authentication</a>
- Terms
    - <a href="https://stackoverflow.com/legal/api-terms-of-use" target="_blank">Stack Exchange API Terms of Use</a>
    - <a href="https://stackoverflow.com/legal/terms-of-service/public#licensing" target="_blank">Stack Overflow Public Network Terms of Service</a>
    - <a href="https://api.stackexchange.com/docs/throttle" target="_blank">Stack Exchange API Throttling and Rate Limits</a>
- Data Reuse
    - <a href="https://stackoverflow.com/legal/acceptable-use-policy" target="_blank">Stack Exchange Acceptable Use Policy</a>

_**NOTE:**_ The Stack Exchange API has specific throttling measures:
- **IP-based throttle:** Maximum of 30 requests per second per IP address. Exceeding this limit results in temporary banning, typically from 30 seconds to several minutes.
- **Daily quotas:**
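  - Applications without an `access_token` share an IP-based daily quota, which defaults to 10,000 requests per day when an API `key` is supplied. Requests sent without a key, like the ones in this tutorial, receive a much smaller quota (the `quota_max` of 300 shown in the outputs below).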
  - Applications with a valid `access_token` have a distinct user/app pair quota, also defaulting to 10,000 requests per day.
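
As a minimal sketch of how a script might respect these limits, the helper below inspects a decoded response and reports how long to pause before the next call. It relies on the `quota_remaining` field shown in the outputs later in this tutorial and on the optional `backoff` field that the API can add to a response. The function name `seconds_to_wait` and the sample dictionaries are hypothetical and are not real API output.

```python
def seconds_to_wait(wrapper, min_quota=1):
    """Return how many seconds to pause before the next request.

    `wrapper` is the decoded JSON response (a dict). When the API includes
    a `backoff` field, it gives the number of seconds to wait before
    calling the same method again.
    """
    # Stop early if the daily quota is used up
    if wrapper.get("quota_remaining", min_quota) < min_quota:
        raise RuntimeError("Daily quota exhausted; wait for it to reset.")
    # No `backoff` field means no extra wait was requested
    return wrapper.get("backoff", 0)


# Stand-in response wrappers used only for illustration
print(seconds_to_wait({"quota_remaining": 295, "backoff": 10}))  # 10
print(seconds_to_wait({"quota_remaining": 295}))  # 0
```

In a loop of requests, the returned value could be passed to `time.sleep()` before the next call.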

*These recipe examples were tested on May 7, 2025.*

## Setup

The following external libraries need to be installed into your environment to run the code examples in this tutorial:
* <a href="https://github.com/psf/requests" target="_blank">requests</a>
* <a href="https://github.com/ipython/ipykernel" target="_blank">ipykernel</a>
* <a href="https://github.com/matplotlib/matplotlib" target="_blank">matplotlib</a>
* <a href="https://github.com/pandas-dev/pandas" target="_blank">pandas</a>

We import the libraries used in this tutorial below:

In [4]:
import requests
import datetime
import pandas as pd
from pprint import pprint

## 1. Get Questions Based on Tags

Change the `tag` variable to get questions based on different tags.

In [5]:
BASE_URL = "https://api.stackexchange.com/2.3/"
tag = 'python'
endpoint = 'questions'
params = {
    'order': 'desc',
    'sort': 'activity',
    'tagged': tag,
    'site': 'stackoverflow'
}

try:
    response = requests.get(BASE_URL + endpoint, params=params)
    # Raise an error for bad responses
    response.raise_for_status()  
    data = response.json()
    pprint(data, depth=1)
except requests.exceptions.RequestException as e:
    print(f"Error fetching data: {e}")
    data = None

{'has_more': True, 'items': [...], 'quota_max': 300, 'quota_remaining': 298}
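
The `has_more` flag in the output indicates that more results are available on later pages. The sketch below shows one way to collect several pages, assuming the API's `page` parameter is used to request each one. To keep the example runnable without network access, the page fetcher is passed in as a callable; `collect_items`, `get_page`, and `fake_pages` are hypothetical names, and the sample pages are not real API output.

```python
def collect_items(get_page, max_pages=3):
    """Gather items across pages until `has_more` is false or max_pages is reached.

    `get_page` is a callable that takes a 1-based page number and returns
    the decoded response wrapper (a dict with `items` and `has_more`).
    """
    items = []
    for page in range(1, max_pages + 1):
        wrapper = get_page(page)
        items.extend(wrapper.get("items", []))
        # Stop as soon as the API reports there is nothing more to fetch
        if not wrapper.get("has_more", False):
            break
    return items


# Stand-in pages used only to exercise the loop
fake_pages = {
    1: {"items": [{"question_id": 1}, {"question_id": 2}], "has_more": True},
    2: {"items": [{"question_id": 3}], "has_more": False},
}
all_items = collect_items(lambda page: fake_pages[page])
print(len(all_items))  # 3
```

With `requests`, `get_page` could wrap a call such as `requests.get(BASE_URL + endpoint, params={**params, 'page': page}).json()`. Each page counts against the daily quota, so a small `max_pages` is a sensible default.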


In [6]:
def convert_unix(unix_time):
    # Convert a Unix timestamp to a readable string in the machine's local time zone
    return datetime.datetime.fromtimestamp(unix_time).strftime('%Y-%m-%d %H:%M:%S')

if data:
    # Extract relevant information from the response
    questions = data.get('items', [])
    
    # Print details for the first five questions
    for question in questions[:5]:
        question_last_activity = convert_unix(question.get('last_activity_date'))
        question_creation = convert_unix(question.get('creation_date'))
        question_title = question.get('title')
        question_link = question.get('link')

        print(f"Title: {question_title}")
        print(f"Link: {question_link}")
        print(f"Last Activity: {question_last_activity}")
        print(f"Creation Date: {question_creation}")
        print("-" * 80)

Title: Not able to connect to Selenium Chrome service
Link: https://stackoverflow.com/questions/79610810/not-able-to-connect-to-selenium-chrome-service
Last Activity: 2025-05-07 14:18:21
Creation Date: 2025-05-07 09:38:55
--------------------------------------------------------------------------------
Title: What is the difference between pyenv, virtualenv, and Anaconda?
Link: https://stackoverflow.com/questions/38217545/what-is-the-difference-between-pyenv-virtualenv-and-anaconda
Last Activity: 2025-05-07 14:18:11
Creation Date: 2016-07-06 01:31:10
--------------------------------------------------------------------------------
Title: Using Pip to install packages to an Anaconda environment
Link: https://stackoverflow.com/questions/41060382/using-pip-to-install-packages-to-an-anaconda-environment
Last Activity: 2025-05-07 14:12:38
Creation Date: 2016-12-09 06:20:55
--------------------------------------------------------------------------------
Title: What is the difference between pi
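
The `convert_unix` function above formats timestamps in the local time zone of the machine running the code, so the same question can show different times on different machines. It also raises an error if a date field is missing. As a hedged alternative, the sketch below formats timestamps in UTC and tolerates a missing value. The name `convert_unix_utc` is hypothetical.

```python
import datetime


def convert_unix_utc(unix_time):
    """Format a Unix timestamp as a UTC string; return None if the field is missing."""
    if unix_time is None:
        return None
    moment = datetime.datetime.fromtimestamp(unix_time, tz=datetime.timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


print(convert_unix_utc(0))     # 1970-01-01 00:00:00 UTC
print(convert_unix_utc(None))  # None
```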

## 2. Get Questions Based on Title

The `intitle` parameter of the `/search` endpoint returns questions whose titles contain the given text. Change the `title` variable to search for different text.

In [7]:
title = 'How to use Python'
endpoint = 'search'
params = {
    'order': 'desc',
    'sort': 'activity',
    'intitle': title,
    'site': 'stackoverflow'
}

try:
    response = requests.get(BASE_URL + endpoint, params=params)
    # Raise an error for bad responses
    response.raise_for_status()  
    data = response.json()
    pprint(data, depth=1)
except requests.exceptions.RequestException as e:
    print(f"Error fetching data: {e}")
    data = None

{'has_more': True, 'items': [...], 'quota_max': 300, 'quota_remaining': 297}


In [8]:
if data:
    # Extract the items from the response
    search_results = data.get('items', [])
    
    for search_result in search_results[:5]:
        search_result_last_activity = convert_unix(search_result.get('last_activity_date'))
        search_result_creation = convert_unix(search_result.get('creation_date'))
        search_result_title = search_result.get('title')
        search_result_link = search_result.get('link')

        print(f"Title: {search_result_title}")
        print(f"Link: {search_result_link}")
        print(f"Last Activity: {search_result_last_activity}")
        print(f"Creation Date: {search_result_creation}")
        print("-" * 80)

Title: How to use Python Unittest TearDownClass with TestResult.wasSuccessful()
Link: https://stackoverflow.com/questions/25630630/how-to-use-python-unittest-teardownclass-with-testresult-wassuccessful
Last Activity: 2025-05-05 02:03:15
Creation Date: 2014-09-02 14:11:33
--------------------------------------------------------------------------------
Title: Can or How to use Python asyncio on Google Cloud Functions?
Link: https://stackoverflow.com/questions/52755996/can-or-how-to-use-python-asyncio-on-google-cloud-functions
Last Activity: 2025-04-16 12:21:10
Creation Date: 2018-10-11 03:51:31
--------------------------------------------------------------------------------
Title: How to use python descriptors with default_factory in dataclass
Link: https://stackoverflow.com/questions/76090498/how-to-use-python-descriptors-with-default-factory-in-dataclass
Last Activity: 2025-04-06 17:01:34
Creation Date: 2023-04-24 04:31:08
---------------------------------------------------------------
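
Text fields returned by the API, such as `title`, may contain HTML entities (for example `&quot;`). The sketch below shows one way to decode them with the standard library before printing or storing results. The helper name `tidy_items` and the sample item are hypothetical and are not real API output.

```python
import html


def tidy_items(items, limit=5):
    """Return title/link pairs with HTML entities in the title decoded."""
    rows = []
    for item in items[:limit]:
        rows.append({
            "title": html.unescape(item.get("title", "")),
            "link": item.get("link"),
        })
    return rows


# Stand-in item used only for illustration
sample = [{"title": "Why does &quot;is&quot; differ from ==?",
           "link": "https://example.com/q/1"}]
print(tidy_items(sample)[0]["title"])  # Why does "is" differ from ==?
```

The same helper could be applied to `search_results` from the cell above, for example `tidy_items(search_results)`.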

## 3. Get Questions Based on Similarity

The `/similar` endpoint returns questions that resemble a given title. Change the `title` variable to find questions similar to a different title.

In [9]:
title = 'How to use Python'
endpoint = 'similar'
params = {
    'order': 'desc',
    'sort': 'activity',
    'title': title,
    'site': 'stackoverflow'
}

try:
    response = requests.get(BASE_URL + endpoint, params=params)
    # Raise an error for bad responses
    response.raise_for_status()  
    data = response.json()
    pprint(data, depth=1)
except requests.exceptions.RequestException as e:
    print(f"Error fetching data: {e}")
    data = None

{'has_more': True, 'items': [...], 'quota_max': 300, 'quota_remaining': 296}


In [10]:
if data:
    # Extract the items from the response
    similar_questions = data.get('items', [])

    for similar_question in similar_questions[:5]:
        last_activity = convert_unix(similar_question.get('last_activity_date'))
        similar_question_creation = convert_unix(similar_question.get('creation_date'))
        similar_question_title = similar_question.get('title')
        similar_question_link = similar_question.get('link')

        print(f"Title: {similar_question_title}")
        print(f"Link: {similar_question_link}")
        print(f"Last Activity: {last_activity}")
        print(f"Creation Date: {similar_question_creation}")
        print("-" * 80)

Title: C how to open/close a file in a function in a endless loop
Link: https://stackoverflow.com/questions/79611253/c-how-to-open-close-a-file-in-a-function-in-a-endless-loop
Last Activity: 2025-05-07 14:21:45
Creation Date: 2025-05-07 13:56:57
--------------------------------------------------------------------------------
Title: Sort array by filenames independent of path
Link: https://stackoverflow.com/questions/79611186/sort-array-by-filenames-independent-of-path
Last Activity: 2025-05-07 14:21:17
Creation Date: 2025-05-07 13:10:34
--------------------------------------------------------------------------------
Title: Common parent of DefaultMutableTreeNode collection
Link: https://stackoverflow.com/questions/79610079/common-parent-of-defaultmutabletreenode-collection
Last Activity: 2025-05-07 14:21:10
Creation Date: 2025-05-07 03:17:37
--------------------------------------------------------------------------------
Title: GoHighLevel Custom Payment Provider — JS in payment.html n
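
The request above sorts by `activity`, so the first results are the most recently active matches, which may not look closely related to the title. As far as we can tell from the API documentation, the `/similar` endpoint also accepts `relevance` as a `sort` value; treat that as an assumption to check before relying on it. The sketch below only builds the request URL so the parameters can be inspected without spending quota. The function name `similar_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.stackexchange.com/2.3/"


def similar_url(title, sort="relevance", site="stackoverflow"):
    """Build a /similar request URL; the `relevance` sort value is an assumption."""
    params = {"order": "desc", "sort": sort, "title": title, "site": site}
    return BASE_URL + "similar?" + urlencode(params)


print(similar_url("How to use Python"))
```

Passing the same parameters to `requests.get(BASE_URL + 'similar', params=...)` would send the request.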

## 4. Get Answers and Comments to a Question

This example retrieves all answers to a question by its question ID. Question IDs can be found using the methods above; the ID also appears in each question's link.

Comments appear above the answers in the output.

This query uses a filter, which controls the fields returned in the response. Filters can be created <a href="https://api.stackexchange.com/docs/create-filters" target="_blank">here</a>.

On that page, click `Run` to get JSON output listing the fields included in the default filter. You can then add fields that are missing or exclude fields that you don't need.

Once you have chosen the fields to include and exclude, click `Run` again and scroll to the bottom of the JSON. You should see `"filter": "{string}"`, where the string is the filter.

Copy and paste this filter into your code to use it.

Here, we use a filter that has already been generated.

In [11]:
# Get all answers to a specific question using a question id
question_id = 33180743
# Named answer_filter to avoid shadowing Python's built-in filter()
answer_filter = '!22ZfkjBmMx*-LZeN4vanL'
endpoint = f'questions/{question_id}/answers'
params = {
    'order': 'desc',
    'sort': 'activity',
    'site': 'stackoverflow',
    'filter': answer_filter
}

try:
    response = requests.get(BASE_URL + endpoint, params=params)
    # Raise an error for bad responses
    response.raise_for_status()  
    data = response.json()
    pprint(data, depth=1)
except requests.exceptions.RequestException as e:
    print(f"Error fetching data: {e}")
    data = None

{'has_more': False, 'items': [...], 'quota_max': 300, 'quota_remaining': 295}


In [12]:
if data:
    # Extract the items from the response
    answers = data.get('items', [])
    
    for answer in answers[:5]:
        answer_last_activity = convert_unix(answer.get('last_activity_date'))
        answer_creation = convert_unix(answer.get('creation_date'))
        answer_body = answer.get('body')
        answer_link = answer.get('link')

        print(f"Answer Body: {answer_body}")
        print(f"Link: {answer_link}")
        print(f"Last Activity: {answer_last_activity}")
        print(f"Creation Date: {answer_creation}")
        print("-" * 80)

Answer Body: <p>For someone testing on Emulator, try choosing an emulator that supports Google Play, then Sign In on Google Play</p>

Link: None
Last Activity: 2024-08-07 16:07:33
Creation Date: 2024-08-07 16:07:33
--------------------------------------------------------------------------------
Answer Body: <p>I get In-app billing version 3 NOT supported error when the user is not signed into google play.  Ensure a user is logged into google play on the device.</p>
<p>Update 2023:  Note that you might also get the error &quot;Google Play In-app Billing API version is less than 3&quot; when the user is not logged in to the play store.</p>

Link: None
Last Activity: 2023-04-14 10:57:11
Creation Date: 2017-06-18 09:36:25
--------------------------------------------------------------------------------
Answer Body: <p>Try "Clear Data" and then "Force stop" for Google Play app.</p>

Link: None
Last Activity: 2015-11-02 03:27:33
Creation Date: 2015-10-26 13:01:45
-----------------------------
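
The `body` field is returned as HTML, as the `<p>` tags and `&quot;` entities in the output show. For plain-text analysis, one option is to strip the markup with the standard library's `html.parser`. The sketch below is a minimal approach, not a full HTML-to-text converter. The names `_TextExtractor` and `body_to_text` and the sample body are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text between tags; entities are decoded by the parser."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def body_to_text(body_html):
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(body_html or "")
    parser.close()
    return " ".join("".join(parser.parts).split())


# Stand-in answer body used only for illustration
sample_body = "<p>Try &quot;Clear Data&quot; and then <b>Force stop</b>.</p>"
print(body_to_text(sample_body))  # Try "Clear Data" and then Force stop.
```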

## 5. Get Frequency of Most Used Tags

In this example, we use the `/tags` endpoint to find the most common tags on Stack Overflow.

In [13]:
endpoint = 'tags'
params = {
    "order": "desc",
    "sort": "popular",
    "site": "stackoverflow",
    "pagesize": 10
}

try:
    response = requests.get(BASE_URL + endpoint, params=params)
    # Raise an error for bad responses
    response.raise_for_status()
    data = response.json()
except requests.exceptions.RequestException as e:
    print(f"Error fetching data: {e}")
    data = None

if data:
    # Extract the items from the response
    tags = data.get('items', [])
    
    tag_names = [tag.get('name') for tag in tags]
    tag_counts = [tag.get('count') for tag in tags]

    df = pd.DataFrame({
        "Tag": tag_names,
        "Count": tag_counts
    })

    print(df.head())

          Tag    Count
0  javascript  2537249
1      python  2222620
2        java  1924466
3          c#  1628038
4         php  1469793
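
To compare the tags on a common scale, the counts can be converted to percentage shares. The sketch below uses the five counts printed above. The shares are relative to these five tags only, not to all questions on Stack Overflow, since a question can carry several tags. The function name `tag_shares` is hypothetical.

```python
# Counts copied from the output above
tag_counts = {
    "javascript": 2537249,
    "python": 2222620,
    "java": 1924466,
    "c#": 1628038,
    "php": 1469793,
}


def tag_shares(counts):
    """Return each tag's percentage of the combined count of the listed tags."""
    total = sum(counts.values())
    return {name: 100 * count / total for name, count in counts.items()}


for name, share in tag_shares(tag_counts).items():
    print(f"{name:<12}{share:5.1f}%")
```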
