Remove Local Variables in biostars_qa Dataset Preprocessing Scripts (#2609)

Addresses comment regarding recent pull request:
#2353 (comment)
cannin committed Apr 16, 2023
1 parent 232a877 commit 3d1785f
Showing 1 changed file with 1 addition and 7 deletions.
8 changes: 1 addition & 7 deletions data/datasets/biostars_qa/get_biostars_dataset.py
@@ -13,20 +13,14 @@ def get_biostars_dataset(start_idx=9557161, accept_threshold=1000000, sleep=0.1,
Download BioStarts data set from the official API using GET requests
Args:
- start_idx (int): The identifier (UID) of the post to retrieve
+ start_idx (int): The identifier (UID) of the post to retrieve; 9557161 was the last post included in the dataset
accept_threshold (int): stop if this many posts with "has_accepted" true are retrieved
sleep (float): Amount of time to sleep between requests
folder (string): folder to store responses as JSON files
Returns:
Nothing. Content is saved to individual JSON files for each post.
"""

- # There is a large number gap in post IDs the numbers skip from 9463943 to 494831
- # Post ID: 9557161 was the last post included in the dataset
- start_idx = 9557161
- accept_threshold = 1000000
- sleep = 0.1

headers = {"Content-Type": "application/json"}

has_accepted_count = 0
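
The effect of the removed assignments can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins, not the code in get_biostars_dataset.py; they only reuse the parameter names and default values shown in the hunk header. The assumption is that the deleted lines reassigned the parameters inside the function body, so values passed by the caller were silently replaced.

```python
def with_local_overrides(start_idx=9557161, accept_threshold=1000000, sleep=0.1):
    # Hypothetical stand-in for the pre-commit behaviour: the parameters are
    # reassigned in the body, so whatever the caller passed is discarded.
    start_idx = 9557161
    accept_threshold = 1000000
    sleep = 0.1
    return start_idx, accept_threshold, sleep


def without_local_overrides(start_idx=9557161, accept_threshold=1000000, sleep=0.1):
    # Hypothetical stand-in for the post-commit behaviour: the defaults live
    # only in the signature, so caller-supplied values are respected.
    return start_idx, accept_threshold, sleep


before = with_local_overrides(start_idx=100, accept_threshold=5, sleep=0.0)
after = without_local_overrides(start_idx=100, accept_threshold=5, sleep=0.0)

print(before)  # (9557161, 1000000, 0.1)
print(after)   # (100, 5, 0.0)
```

Keeping the defaults only in the signature leaves the behaviour of a no-argument call unchanged while letting callers choose a different starting post ID, threshold, or delay.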