minhhungit/github-action-rss-crawler

RSS auto-crawling using GitHub Actions

GitHub Actions performs all of the steps below automatically and runs the RSS crawler every 4 hours.

Steps:

  • GitHub checks out the repository, then builds and runs the crawler (the crawler is written in C# on .NET Core, and the workflow runs it directly with dotnet run)
  • Read channel URLs from LiteDB
  • Fetch RSS feed items
  • Insert feed items into LiteDB, skipping any that are blacklisted or already stored
  • Render all RSS items to a static page (index.html - https://minhhungit.github.io/github-action-rss-crawler/ )
  • Commit the changes (LiteDB database and index.html page) and push them to this repo
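
The middle steps (parse feed items, filter them, render the page) can be sketched in C# roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: `FeedItem`, `CrawlerSketch`, `ParseItems`, `FilterNew`, and `RenderHtml` are hypothetical names, the blacklist is assumed to be a list of title keywords, the RSS XML is passed in as a string instead of being fetched over HTTP, and in-memory collections stand in for LiteDB so the example needs no NuGet packages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Xml.Linq;

// Hypothetical model of one feed entry; the real crawler's model may differ.
public sealed class FeedItem
{
    public string Title { get; set; }
    public string Link { get; set; }
}

public static class CrawlerSketch
{
    // Parse the <item> elements of an RSS 2.0 document, dropping items without a link.
    public static List<FeedItem> ParseItems(string rssXml)
    {
        return XDocument.Parse(rssXml)
            .Descendants("item")
            .Select(e => new FeedItem
            {
                Title = (string)e.Element("title") ?? "",
                Link = (string)e.Element("link") ?? ""
            })
            .Where(i => i.Link.Length > 0)
            .ToList();
    }

    // Keep only items that are neither blacklisted nor already stored.
    // existingLinks stands in for a LiteDB lookup and is updated in place,
    // so duplicates within the same batch are skipped as well.
    public static List<FeedItem> FilterNew(
        IEnumerable<FeedItem> items,
        ISet<string> existingLinks,
        IEnumerable<string> blacklistedWords)
    {
        var words = blacklistedWords.ToList();
        var result = new List<FeedItem>();
        foreach (var item in items)
        {
            // Assumption: an item is blacklisted when its title contains a keyword.
            bool blocked = words.Any(w =>
                item.Title.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0);
            // ISet.Add returns false when the link is already present.
            if (!blocked && existingLinks.Add(item.Link))
                result.Add(item);
        }
        return result;
    }

    // Render items as a minimal static page, HTML-encoding titles and links.
    public static string RenderHtml(IEnumerable<FeedItem> items)
    {
        var rows = items.Select(i =>
            $"<li><a href=\"{WebUtility.HtmlEncode(i.Link)}\">{WebUtility.HtmlEncode(i.Title)}</a></li>");
        return "<html><body><ul>" + string.Join("", rows) + "</ul></body></html>";
    }
}
```

In a real run, the result of `RenderHtml` would be written to index.html and the new items inserted into the LiteDB file before the commit step. Tracking stored links in a set keeps the duplicate check cheap regardless of how many feeds are crawled.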

Workflow

on:
  schedule:
    # Runs every 4h
    - cron: '0 */4 * * *'
  workflow_dispatch:
  
jobs:
  update-readme-with-blog:
    name: Crawl rss and generate static page
    runs-on: windows-2019
    steps:
      - uses: actions/checkout@main
        with:
          repository: minhhungit/github-action-rss-crawler
          token: ${{ secrets.GITHUB_TOKEN }}
      - uses: actions/setup-dotnet@v1
        with:
          dotnet-version: 3.1.x
      #- run: dotnet build DemoApp\DemoApp.sln      
      - run: dotnet run --project RssCrawler\RssCrawler.csproj
      - run: git config --local user.email "it.minhhung@gmail.com"
      - run: git config --local user.name "Jin"
      - run: git add .
      - run: git commit -m "Add changes"
      - run: git push
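
To make the schedule concrete: GitHub Actions evaluates cron expressions in UTC, and the hour field `*/4` matches every hour divisible by 4. The small C# sketch below lists those hours; `ScheduleSketch` and `MatchingHours` are hypothetical names used only for illustration and are not part of the repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScheduleSketch
{
    // Hours of the day (UTC) matched by a cron hour field of the form "*/step".
    public static List<int> MatchingHours(int step)
    {
        return Enumerable.Range(0, 24).Where(h => h % step == 0).ToList();
    }

    public static void Main()
    {
        // Prints: 0, 4, 8, 12, 16, 20
        Console.WriteLine(string.Join(", ", MatchingHours(4)));
    }
}
```

With minute `0` and hour `*/4`, the crawler is therefore scheduled six times a day, on the hour. Scheduled workflows can start later than the nominal time when GitHub is under load.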

Demo

https://minhhungit.github.io/github-action-rss-crawler/

Donate ^^

If you like my work and would like to support it, you can buy me a coffee ☕️ anytime.

Buy Me a Coffee at ko-fi.com

I would appreciate it!!!