nfflow/redditflow

Redditflow.

**Do everything from data collection from Reddit to training a machine learning model in just two lines of Python code!**


Supports:

  • Text Data
  • Image Data

Execution is as simple as this:

  • Create a config with the input details you need.
  • Call the API in a single line, passing the config as input.

Installation.

pip install redditflow

Install the latest version from source.

pip install git+https://github.com/nfflow/redditflow

Examples

Text data collection, followed by model training.

from redditflow import TextApi


config = {
    "sort_by": "top",
    "subreddit_text_limit": 50,
    "total_limit": 200,
    "start_time": "27.03.2021 11:38:42",
    "end_time": "27.03.2022 11:38:42",
    "subreddit_search_term": "healthcare",
    "subreddit_object_type": "comment",
    "ml_pipeline": {
        "model_name": "distilbert-base-uncased",
        "model_output_path": "healthcare_27.03.2021-27.03.2022_redditflow",
        "model_architecture": "CT",
    },
}


TextApi(config)
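
The `start_time` and `end_time` values in the examples appear to follow a `DD.MM.YYYY HH:MM:SS` pattern. The sketch below builds such strings with the standard library instead of typing them by hand. The format string is inferred from the example values, not taken from the redditflow documentation, and `time_window` is a hypothetical helper that is not part of redditflow.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Assumption: day.month.year with 24-hour time, inferred from the example values.
TIME_FORMAT = "%d.%m.%Y %H:%M:%S"


def time_window(end: datetime, days: int) -> Tuple[str, str]:
    """Return (start_time, end_time) strings covering `days` days up to `end`."""
    start = end - timedelta(days=days)
    return start.strftime(TIME_FORMAT), end.strftime(TIME_FORMAT)


# Hypothetical usage: a one-year window ending on 27 March 2022.
start_time, end_time = time_window(datetime(2022, 3, 27, 11, 38, 42), days=365)
print(start_time)  # 27.03.2021 11:38:42
print(end_time)    # 27.03.2022 11:38:42
```

The two strings can then be placed under the `start_time` and `end_time` keys of the config.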


Image data collection

from redditflow import ImageApi


config = {
    "sort_by": "top",
    "subreddit_image_limit": 3,
    "total_limit": 10,
    "start_time": "13.11.2021 09:38:42",
    "end_time": "15.11.2021 11:38:42",
    "subreddit_search_term": "cats",
    "subreddit_object_type": "comment",
    "client_id": "$CLIENT_ID",  # get client id for praw
    "client_secret": "$CLIENT_SECRET",  # get client secret for praw
}

ImageApi(config)


The image API relies on praw (the Python Reddit API Wrapper), so a praw client_id and client_secret are required. Reddit issues both when you register an application for API access.
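
To keep the credentials out of source code, one option is to read them from environment variables and merge them into the config. This is a minimal sketch: the variable names `CLIENT_ID` and `CLIENT_SECRET` are assumptions borrowed from the placeholders in the example, and `praw_credentials` is a hypothetical helper, not part of redditflow or praw.

```python
import os


def praw_credentials(env=os.environ):
    """Read praw credentials from a mapping, failing early if any are missing."""
    # Assumed variable names; adjust to whatever your environment uses.
    missing = [key for key in ("CLIENT_ID", "CLIENT_SECRET") if not env.get(key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {"client_id": env["CLIENT_ID"], "client_secret": env["CLIENT_SECRET"]}


# Demonstration with an explicit mapping instead of the real environment.
creds = praw_credentials({"CLIENT_ID": "abc", "CLIENT_SECRET": "xyz"})
config = {"subreddit_search_term": "cats", **creds}
```

In real use, calling `praw_credentials()` with no argument reads from the process environment.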

Citation.

If you use our work, please cite the software at: https://github.com/nfflow/redditflow

Launching nfflow Rewards

Contributed to nfflow? Here is a big thank you from our community. Claim your badges and showcase them with pride. Let us inspire more folks!

nfflow Badges
