<img width="8%" alt="Instagram.png" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/Instagram.png" style="border-radius: 15%">

# Instagram - Get comments from post
<a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Instagram+-+Get+comments+from+post:+Error+short+description">Bug report</a>

**Tags:** #instagram #likes #comments #snippet #content

**Author:** [Varsha Kumar](https://www.linkedin.com/in/varsha-kumar-590466305/)

**Last update:** 2024-07-10 (Created: 2024-07-10)

**Description:** This notebook allows users to extract comments from an Instagram post.

### How to retrieve an API key with Apify

1. Go to https://apify.com.
2. Click "Sign up for free" and sign up with your Google account.
3. Once your account has been created, navigate to "Settings" in the left panel of the screen.
4. Open the "Integrations" tab, where you will find the personal API token that was generated automatically at sign-up.
5. Copy that token and use it to extract data!
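
Once the token is copied, one option is to keep it out of the notebook itself. The sketch below reads it from an environment variable; the variable name `APIFY_TOKEN` and the helper `load_apify_token` are illustrative assumptions, not part of the original notebook or of Apify's tooling.

```python
import os


def load_apify_token(env_var: str = "APIFY_TOKEN") -> str:
    """Return the Apify token stored in an environment variable.

    The variable name is a hypothetical convention chosen for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Apify API token."
        )
    return token
```

With this in place, `apify_token = load_apify_token()` could replace a hard-coded value, so the token is not saved alongside the notebook.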

## Input

### Import libraries

In [1]:
import requests
import pandas as pd
import json
import time

### Setup variables
- `apify_token`: personal API token that Apify generates for your account
- `post_url`: link to the Instagram post
- `output_csv`: name of the CSV file the comments are saved to

In [7]:
apify_token = "apify_api_gXWnLEPiE7wC8ALUwQkJ0QcdbuQzU84xxxxx"
post_url = "https://www.instagram.com/p/Cn0cUc7KelU/"
output_csv = f"{post_url.split('https://www.instagram.com/')[1].replace('/', '_')}instagram_post_comments.csv"
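
To make the file name derivation concrete, here is a minimal sketch that builds the same name from the URL path. The helper name `csv_name_from_post_url` is hypothetical; unlike the string split above, it also drops any query string that may be attached to the link.

```python
from urllib.parse import urlparse


def csv_name_from_post_url(post_url: str) -> str:
    """Build a CSV file name from the path of an Instagram post URL."""
    # "https://www.instagram.com/p/Cn0cUc7KelU/" has the path "/p/Cn0cUc7KelU/"
    path = urlparse(post_url).path.strip("/")
    slug = path.replace("/", "_")
    return f"{slug}_instagram_post_comments.csv"


print(csv_name_from_post_url("https://www.instagram.com/p/Cn0cUc7KelU/"))
# prints p_Cn0cUc7KelU_instagram_post_comments.csv
```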

## Model

### Scrape post comments

In [3]:
# Define the input for the Instagram Comment Scraper actor
input_data = {
    "directUrls": [post_url],
    "resultsType": "comments",
}

# Make a request to start the actor
start_actor_url = f"https://api.apify.com/v2/acts/apify~instagram-comment-scraper/runs?token={apify_token}"
response = requests.post(start_actor_url, json=input_data)
response.raise_for_status()  # Stop here if the token is invalid or the request was rejected
run_details = response.json()

# Extract the run ID
run_id = run_details['data']['id']

# Define the URL to fetch the actor run status
run_status_url = f"https://api.apify.com/v2/acts/apify~instagram-comment-scraper/runs/{run_id}?token={apify_token}"

# Wait for the actor to reach a terminal status
while True:
    status_response = requests.get(run_status_url)
    status_data = status_response.json()
    if status_data['data']['status'] in ['SUCCEEDED', 'FAILED', 'ABORTED', 'TIMED-OUT']:
        break
    time.sleep(5)  # Wait for 5 seconds before checking again

if status_data['data']['status'] == 'SUCCEEDED':
    # Define the URL to fetch the results
    dataset_id = status_data['data']['defaultDatasetId']
    dataset_url = f"https://api.apify.com/v2/datasets/{dataset_id}/items?token={apify_token}&format=json"

    # Fetch the comments
    comments_response = requests.get(dataset_url)
    comments_data = comments_response.json()

else:
    comments_data = []  # Keep the following cells runnable when the run fails
    print(f"Actor run did not succeed. Status: {status_data['data']['status']}")
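
The polling step can also be bounded so that a stuck run does not block the notebook indefinitely. The following is a sketch under assumptions: `wait_for_run`, its parameters, and the 10-minute default are illustrative, and the set of terminal status names is assumed rather than taken from the notebook. The status lookup is passed in as a callable so the loop can be tried without network access.

```python
import time
from typing import Callable

# Assumed terminal statuses; adjust if the API reports others.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + max_wait_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run still in status {status!r} after waiting.")
        sleep(poll_seconds)
```

In this notebook, `fetch_status` could be something like `lambda: requests.get(run_status_url).json()['data']['status']`.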

### Dataframe structure function

In [4]:
def get_comments(
    cid,
    text,
    username,
    profile_picture,
    timestamp,
    likes_count
):
    return {
        "ID": cid,
        "TEXT": text,
        "USERNAME": username,
        "PROFILE_PICTURE": profile_picture,
        "TIMESTAMP": timestamp,
        "LIKES_COUNT": likes_count
    }

## Output

### Display output

In [5]:
data = []

for comment in comments_data:
    data_comment = get_comments(
        comment["id"],
        comment["text"],
        comment["ownerUsername"],
        comment["ownerProfilePicUrl"],
        comment["timestamp"],
        comment["likesCount"]
    )
    data.append(data_comment)
        
df = pd.DataFrame(data)
df
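
The loop above indexes every key directly, so a single comment without, say, a profile picture would raise a `KeyError`. A more tolerant variant is sketched below; `FIELD_MAP` and `comment_to_row` are hypothetical names, the field names are the ones used in the cell above, and the sample record is made up for illustration.

```python
from typing import Any, Dict

# Scraper field -> output column, mirroring the mapping used in this notebook.
FIELD_MAP = {
    "id": "ID",
    "text": "TEXT",
    "ownerUsername": "USERNAME",
    "ownerProfilePicUrl": "PROFILE_PICTURE",
    "timestamp": "TIMESTAMP",
    "likesCount": "LIKES_COUNT",
}


def comment_to_row(comment: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw comment to an output row; missing fields become None."""
    return {column: comment.get(field) for field, column in FIELD_MAP.items()}


# Made-up record with some fields missing
sample = {"id": "1", "text": "Nice post", "ownerUsername": "someone"}
row = comment_to_row(sample)
```

The rows could then be collected with `pd.DataFrame([comment_to_row(c) for c in comments_data])`, leaving empty cells where a field was absent.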

### Save dataframe to csv

In [6]:
df.to_csv(output_csv, index=False)