TPost

TPost is a starter project for an app that crawls websites and publishes posts to different destinations. For example, it can use a Telegram bot to publish posts to a Telegram channel.

How to start?

In progress...

How it works?

In progress...

IPostCrawler

A crawler extracts the required content from a web page and returns posts.
IPostCrawler is the interface for crawlers. It has a single method, GetPosts, which returns IReadOnlyCollection<IPost>.
For example, you can implement a crawler using HtmlAgilityPack.
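As a rough illustration of the crawler contract, here is a minimal sketch. The `IPost` and `IPostCrawler` declarations are hypothetical stand-ins written only so the example compiles on its own; the real TPost.Core interfaces may have different members. `TextPost`, `StaticHtmlCrawler`, and the `joke` CSS class are also invented for this example, and a regular expression is used in place of HtmlAgilityPack to avoid an external dependency.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-ins for the TPost.Core abstractions (assumed shapes).
public interface IPost
{
    string Text { get; }
}

public interface IPostCrawler
{
    IReadOnlyCollection<IPost> GetPosts();
}

// Hypothetical post type used only in this sketch.
public sealed record TextPost(string Text) : IPost;

// Illustrative crawler: extracts the text of every <p class="joke"> element
// from an HTML string that is passed in instead of being downloaded.
public sealed class StaticHtmlCrawler : IPostCrawler
{
    private static readonly Regex JokePattern = new Regex(
        "<p class=\"joke\">(.*?)</p>", RegexOptions.Singleline);

    private readonly string _html;

    public StaticHtmlCrawler(string html) => _html = html;

    public IReadOnlyCollection<IPost> GetPosts() =>
        JokePattern.Matches(_html)
            .Select(match => new TextPost(match.Groups[1].Value.Trim()))
            .ToList<IPost>();
}
```

A real crawler would typically download the page first and use a proper HTML parser rather than a regular expression.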

IPostCrawlerManager

The crawler manager takes all crawlers from the DI container and refills the store with posts. The default implementation is DefaultPostCrawlerManager. It takes count / n posts from each registered crawler, where count is the argument of the RenewPostStore method (10 by default) and n is the number of IPostCrawler implementations registered in the DI container.
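The count / n split can be sketched as follows. This is not the code of DefaultPostCrawlerManager; the `CrawlerQuota` helper is hypothetical, and the use of integer division (dropping any remainder) is an assumption that this README does not confirm.

```csharp
using System;

// Hypothetical helper illustrating the count / n rule described above.
public static class CrawlerQuota
{
    // Assumption: integer division, so any remainder is dropped.
    public static int PostsPerCrawler(int count, int crawlerCount)
    {
        if (crawlerCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(crawlerCount));

        return count / crawlerCount;
    }
}
```

Under that assumption, the default count of 10 with two registered crawlers gives 5 posts per crawler, and with three crawlers it gives 3 posts per crawler.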

IPostStore

The post store holds crawled posts.
The default implementation is ConcurrentQueuePostStore.
The GetAndRemoveOne method takes one post from the store and removes it. If you want to use a persistent store and keep published posts, you should implement a mechanism that does not return already published posts (for example, an extra column such as IsPublished in a table).
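A queue-backed store with take-and-remove semantics might look like the sketch below. It is not the source of ConcurrentQueuePostStore: the `QueuePostStore` class, its `Add` method, the `IPost` shape, and the choice to return null for an empty store are all assumptions made for illustration.

```csharp
#nullable enable
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for the TPost.Core post abstraction (assumed shape).
public interface IPost
{
    string Text { get; }
}

// Hypothetical post type used only in this sketch.
public sealed record TextPost(string Text) : IPost;

// Illustrative in-memory store; only GetAndRemoveOne is named in this README.
public sealed class QueuePostStore
{
    private readonly ConcurrentQueue<IPost> _posts = new ConcurrentQueue<IPost>();

    // Assumed method for refilling the store.
    public void Add(IEnumerable<IPost> posts)
    {
        foreach (var post in posts)
        {
            _posts.Enqueue(post);
        }
    }

    // Returns the oldest post and removes it, or null when the store is empty.
    public IPost? GetAndRemoveOne() =>
        _posts.TryDequeue(out var post) ? post : null;
}
```

Because the post is dequeued, it cannot be returned twice. A persistent store would need an equivalent guarantee, such as the IsPublished flag mentioned above.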

IPostPublisher

The interface for post publishers.
The default implementation is CompositePostPublisher, which takes all IPostPublisherTransport implementations from the DI container and calls the Publish method on each of them.

IPostPublisherTransport

The publisher transport inherits from IPostPublisher and was added to make CompositePostPublisher work. With the default behavior, this is the only interface you need to implement. It allows you to have multiple publish destinations.
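The relationship between a publisher and its transports can be sketched as a simple fan-out. The interface declarations below are hypothetical stand-ins (the real Publish signature may differ, for example by being asynchronous), and `FanOutPublisher`, `InMemoryTransport`, and `TextPost` are names invented for this example rather than TPost types.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the TPost.Core abstractions (assumed shapes).
public interface IPost
{
    string Text { get; }
}

public interface IPostPublisher
{
    void Publish(IPost post);
}

public interface IPostPublisherTransport : IPostPublisher
{
}

// Hypothetical post type used only in this sketch.
public sealed record TextPost(string Text) : IPost;

// Illustrative composite: forwards each post to every transport it was given.
public sealed class FanOutPublisher : IPostPublisher
{
    private readonly IReadOnlyList<IPostPublisherTransport> _transports;

    public FanOutPublisher(IEnumerable<IPostPublisherTransport> transports) =>
        _transports = transports.ToList();

    public void Publish(IPost post)
    {
        foreach (var transport in _transports)
        {
            transport.Publish(post);
        }
    }
}

// Illustrative transport that records published texts in memory.
public sealed class InMemoryTransport : IPostPublisherTransport
{
    public List<string> Published { get; } = new List<string>();

    public void Publish(IPost post) => Published.Add(post.Text);
}
```

With this shape, adding a new destination only requires another transport implementation; the composite itself does not change.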

PostJob

The post job takes a post from the store and publishes it through the registered IPostPublisher on a cron schedule.

Publishers

Publishers are used to publish posts to different destinations.
If you want to use a custom publisher, implement the IPostPublisherTransport interface. In the default implementation, it will be used by CompositePostPublisher, but you can register it as the single publisher by removing the IPostPublisher registration from the IServiceCollection, which occurs in TPostHostFactory.

Sample

You can find a working sample in the _Samples/TPost.Host.Sample folder. This sample crawls jokes from two sites and publishes them to the console. As you can see, it just implements two crawlers and registers them in a host.
I have a real project that works like this; it posts jokes to the @cringedot Telegram channel.

List of Packages

| Package Name | Description |
| --- | --- |
| TPost.Core | Core abstractions |
| TPost.Hosting | Hosting abstractions |
| TPost.Publishers.Telegram | Telegram bot publisher implementation |
