about.md
+1 -1 (1 addition & 1 deletion)
@@ -31,4 +31,4 @@ For insights into our ongoing development, project roadmap, and how you can get
31
31
32
32
`html2rss` is maintained by a dedicated group of volunteers and contributors from around the world. We are passionate about open source and committed to continuously improving the project.
33
33
34
-
Want to join us? Check out our [Contributing Guide]({{ '/contributing' | relative_url }})!
34
+
Want to join us? Check out our [Contributing Guide]({{ '/get-involved/contributing' | relative_url }})!
html2rss-configs/index.md
+46 -32 (46 additions & 32 deletions)
@@ -5,21 +5,29 @@ has_children: false
5
5
nav_order: 5
6
6
---
7
7
8
-
# Creating Feed Configurations
8
+
# Creating Custom RSS Feeds
9
9
10
-
Welcome to the guide for `html2rss-configs`. This document explains how to create your own configuration files to convert any website into an RSS feed.
10
+
Want to create RSS feeds for websites that don't offer them? This guide shows you how to write simple configuration files that tell the html2rss engine exactly what content to extract.
11
11
12
-
You can find a list of all community-contributed configurations in the [Feed Directory]({{ '/feed-directory/' | relative_url }}).
12
+
**Don't worry if you're not technical** - we'll explain everything step by step!
13
+
14
+
You can see examples of what others have created in the [Feed Directory]({{ '/feed-directory/' | relative_url }}).
13
15
14
16
---
15
17
16
-
## Core Concepts
18
+
## How It Works
19
+
20
+
Think of the html2rss engine as a smart assistant that needs instructions. You give it a simple "recipe" (called a config file) that tells it:
21
+
22
+
1. **Which website** to look at
23
+
2. **What content** to find (articles, posts, etc.)
24
+
3. **How to organize** that content into an RSS feed
17
25
18
-
An `html2rss` config is a YAML file that defines how to extract data from a web page. It consists of two main building blocks: `channel` and `selectors`.
26
+
The recipe is written in YAML - a simple format that's easy to read and write. Both html2rss-web and the html2rss Ruby gem use these same configuration files.
19
27
20
28
### The `channel` Block
21
29
22
-
The `channel` block contains metadata about the RSS feed itself, such as its title and the source URL.
30
+
This tells the html2rss engine basic information about your feed - like giving it a name and telling it which website to look at.
23
31
24
32
**Example:**
25
33
@@ -29,11 +37,11 @@ channel:
29
37
title: My Awesome Blog
30
38
```
31
39
32
-
For a complete list of all available channel options, please see the [Channel Reference]({{ '/ruby-gem/reference/channel/' | relative_url }}).
40
+
This says: "Look at this website and call the feed 'My Awesome Blog'"
33
41
34
42
### The `selectors` Block
35
43
36
-
The `selectors` block is the core of the configuration, defining the rules for extracting content. It always contains an `items` selector to identify the list of articles and individual selectors for the data points within each item (e.g., `title`, `link`).
44
+
This is where you tell the html2rss engine exactly what to find on the page. You use CSS selectors (like you might use in web design) to point to specific parts of the webpage.
37
45
38
46
**Example:**
39
47
@@ -47,17 +55,19 @@ selectors:
47
55
selector: "h2 a"
48
56
```
49
57
50
-
For a comprehensive guide on all available selectors, extractors, and post-processors, please see the [Selectors Reference]({{ '/ruby-gem/reference/selectors/' | relative_url }}).
58
+
This says: "Find each article, get the title from the h2 link, and get the link from the same h2 link"
59
+
60
+
**Need more details?** Check our [complete guide to selectors]({{ '/ruby-gem/reference/selectors/' | relative_url }}) for all the options.
51
61
52
62
---
53
63
54
-
## Tutorial: Your First Config
64
+
## Tutorial: Your First Feed
55
65
56
-
This tutorial walks you through creating a basic configuration file from scratch.
66
+
Let's create a simple RSS feed step by step. We'll use a basic blog as our example.
57
67
58
-
### Step 1: Identify the Target Content
68
+
### Step 1: Look at the Website
59
69
60
-
First, identify the HTML structure of the website you want to create a feed for. For this example, we'll use a simple blog structure:
70
+
First, visit the website you want to create a feed for. Right-click and "View Page Source" to see the HTML structure. Look for patterns like this:
61
71
62
72
```html
63
73
<div class="posts">
@@ -72,9 +82,11 @@ First, identify the HTML structure of the website you want to create a feed for.
72
82
</div>
73
83
```
74
84
75
-
### Step 2: Create the Config File and Define the Channel
85
+
**What we see:** Each article is wrapped in `<article class="post">`, titles are in `<h2><a>` tags, and descriptions are in `<p>` tags.
86
+
87
+
### Step 2: Create Your Config File
76
88
77
-
Create a new YAML file (e.g., `my-blog.yml`) and define the `channel`:
89
+
Create a new text file and save it as `my-blog.yml` (or any name you like). Add this basic information:
78
90
79
91
```yaml
80
92
# my-blog.yml
@@ -84,9 +96,11 @@ channel:
84
96
description: The latest news from my awesome blog.
85
97
```
86
98
87
-
### Step 3: Define the Selectors
99
+
This tells html2rss: "Look at this website and call the feed 'My Awesome Blog'"
88
100
89
-
Next, add the `selectors` block to extract the content for each post.
101
+
### Step 3: Tell html2rss What to Find
102
+
103
+
Now add the selectors that tell html2rss exactly what content to extract:
90
104
91
105
```yaml
92
106
# my-blog.yml
@@ -101,26 +115,17 @@ selectors:
101
115
selector: "p"
102
116
```
103
117
104
-
- `items`: This CSS selector identifies the container for each article.
105
-
- `title`, `link`, `description`: These selectors target the specific data points within each item. For a `link` selector, `html2rss` defaults to extracting the `href` attribute from the matched `<a>` tag.
118
+
**What this means:**
119
+
120
+
- `items: "article.post"` = "Find each article with class 'post'"
121
+
- `title: "h2 a"` = "Get the title from the h2 link"
122
+
- `link: "h2 a"` = "Get the link from the same h2 link"
123
+
- `description: "p"` = "Get the description from the paragraph"
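
Putting the three tutorial steps together, the finished `my-blog.yml` might look roughly like the sketch below. The hunks above elide the `url` line, so `https://example.com/blog` is a placeholder assumption, and the explicit `extractor: "href"` on `link` is also an assumption that simply spells out the href extraction instead of relying on a default.

```yaml
# my-blog.yml
# Sketch assembled from the tutorial hunks; the url is a placeholder (assumption).
channel:
  url: https://example.com/blog
  title: My Awesome Blog
  description: The latest news from my awesome blog.

selectors:
  items:
    selector: "article.post" # one feed item per <article class="post">
  title:
    selector: "h2 a" # text of the link inside the h2
  link:
    selector: "h2 a"
    extractor: "href" # assumption: made explicit; take the link target, not the link text
  description:
    selector: "p"
```

Every selector other than `items` is evaluated inside each matched item, which is why `title` and `link` can share the same `h2 a` selector.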
106
124
107
125
---
108
126
109
127
## Advanced Techniques
110
128
111
-
### Handling Pagination
112
-
113
-
To aggregate content from multiple pages, use the `pagination` option within the `items` selector.
114
-
115
-
```yaml
116
-
selectors:
117
-
items:
118
-
selector: ".post-listing .post"
119
-
pagination:
120
-
selector: ".pagination .next-page"
121
-
limit: 5 # Optional: sets the maximum number of pages to follow
122
-
```
123
-
124
129
### Dynamic Feeds with Parameters
125
130
126
131
Use the `parameters` block to create flexible configs. This is useful for feeds based on search terms, categories, or regions.
index.md
+20 -15 (20 additions & 15 deletions)
@@ -4,30 +4,35 @@ title: Home
4
4
nav_order: 1
5
5
---
6
6
7
-
# Create RSS Feeds for Any Website
7
+
# Turn Any Website Into an RSS Feed
8
8
9
-
`html2rss` creates RSS feeds for any website.
10
-
[**🚀 Get Started with the Web App**]({{ '/web-application/getting-started' | relative_url }})
9
+
Ever wished you could follow your favorite websites like a social media feed? The html2rss project makes it possible by creating RSS feeds for any website - even ones that don't offer them.
10
+
11
+
[**🚀 Get Started with html2rss-web**]({{ '/web-application/getting-started' | relative_url }})
- **Precise Content Extraction:** Use CSS selectors for targeted content inclusion.
18
-
- **JavaScript Rendering:** A headless browser renders JavaScript-heavy sites for comprehensive content extraction.
19
-
- **Open Source:** `html2rss` is free to use, modify, and contribute.
17
+
RSS (Really Simple Syndication) lets you follow websites in your favorite feed reader. Instead of checking multiple websites daily, you get all updates in one place - like a personalized news feed.
20
18
21
-
---
19
+
## The html2rss Project
20
+
21
+
The html2rss project provides two main ways to create RSS feeds:
22
22
23
-
## The html2rss Ecosystem
23
+
- **html2rss-web** - A user-friendly web application (recommended for most users)
24
+
- **html2rss** - A Ruby gem for developers and advanced users
25
+
26
+
Both use the same powerful engine to extract content from websites and convert it into RSS feeds.
27
+
28
+
---
24
29
25
-
The `html2rss` project offers a complete RSS solution through a collection of integrated tools:
30
+
## Choose Your Path
26
31
27
-
- **[html2rss-web]({{ '/web-application' | relative_url }}):** User-friendly web application to create, manage, and share RSS feeds. Recommended starting point.
28
-
- **[html2rss (Ruby Gem)]({{ '/ruby-gem' | relative_url }}):** Core library and command-line interface for developers.
- **[html2rss-web]({{ '/web-application' | relative_url }}):** **Start here!** Easy-to-use web application. No technical knowledge required.
33
+
- **[Feed Directory]({{ '/feed-directory' | relative_url }}):** Browse ready-made feeds for popular websites
34
+
- **[html2rss (Ruby Gem)]({{ '/ruby-gem' | relative_url }}):** For developers who want to create custom configurations
30
35
31
36
---
32
37
33
-
Engage with the `html2rss` community or contribute. Visit our [Get Involved]({{ '/get-involved' | relative_url }}) page.
38
+
**Ready to get started?** Check out our [html2rss-web getting started guide]({{ '/web-application/getting-started' | relative_url }}) or [browse existing feeds]({{ '/feed-directory' | relative_url }}) to see what's possible.
ruby-gem/installation.md
+5 -7 (5 additions & 7 deletions)
@@ -13,7 +13,7 @@ This guide will walk you through the process of installing html2rss on your syst
13
13
14
14
### Prerequisites
15
15
16
-
- **Ruby:** html2rss is built with Ruby. Ensure you have Ruby installed (version 3.3 or higher recommended). You can check your Ruby version by running `ruby -v` in your terminal. If you don't have Ruby, visit [ruby-lang.org](https://www.ruby-lang.org/en/documentation/installation/) for installation instructions.
16
+
- **Ruby:** html2rss is built with Ruby. Ensure you have Ruby installed (version 3.2 or higher required). You can check your Ruby version by running `ruby -v` in your terminal. If you don't have Ruby, visit [ruby-lang.org](https://www.ruby-lang.org/en/documentation/installation/) for installation instructions.
17
17
- **Bundler (Recommended):** Bundler is a Ruby gem that manages your application's dependencies. It's highly recommended for a smooth installation. Install it with `gem install bundler`.
18
18
19
19
---
@@ -43,15 +43,13 @@ Then, run `bundle install` in your project directory.
For a more isolated and reproducible environment, you can use the official html2rss Docker image.
48
+
For a quick start without local setup, you can develop html2rss directly in your browser using GitHub Codespaces:
49
49
50
-
```bash
51
-
docker pull html2rss/html2rss
52
-
```
50
+
[](https://github.com/codespaces/new?repo=html2rss/html2rss)
53
51
54
-
You can then run html2rss commands within a Docker container. Refer to the [Docker Hub page](https://hub.docker.com/r/html2rss/html2rss) for detailed usage.
52
+
The Codespace comes pre-configured with Ruby 3.4, all dependencies, and VS Code extensions ready to go!