bsky.rss version 2 #25
So I ran docker build -t bsky-queue . and this is the trace log: TSError: ⨯ Unable to compile TypeScript: │ |
A suggestion based on my extensive use of newsbots over the years, mostly on Twitter but also on Mastodon. User experience is at its best when queue management and post spacing can be configured; that's the baseline. Maybe it's too complicated to implement, but I think a formula could be worked out that takes into account the size of the queue and the recommended spacing between posts, then adds some random math to make posting times fuzzier and less predictable. So if the queue is very small, add more space between posts, but if the queue is big, post more regularly? Maybe I'm going completely overboard with this suggestion; just say no and I won't insist 😄 |
After some debugging:
|
So when I set the noUnusedLocals option in tsconfig.json to false, the container runs without complaining about this variable, and things apparently get queued:
But then something is undefined? (The spacing between posts?)
This is an extract. I tweaked last.txt (moved the timestamp back in time) to create a situation where the last fetch was an hour ago, and it flooded the timeline with 10+ posts at once, but it did not go over the rate limits. |
I'm not a TypeScript/JS dev, but with some help from a code-interpreter AI this is what I got: assuming you have a function that processes the queue, you could add a delay that depends on the size of the queue. Here's a simple example:

    // `queue` and `postFunction` are assumed to be defined elsewhere.
    async function processQueue() {
      while (queue.length > 0) {
        const post = queue.shift(); // Get the next post from the queue
        await postFunction(post); // Post it

        // Calculate delay
        const minDelay = 5 * 60 * 1000; // Minimum delay of 5 minutes
        const maxDelay = minDelay + queue.length * 1000; // Add 1 second for each remaining post in the queue
        const delay = Math.random() * (maxDelay - minDelay) + minDelay; // Random delay between minDelay and maxDelay

        // Wait for the delay before continuing with the next post
        await new Promise(resolve => setTimeout(resolve, delay));
      }
    }

You probably already had something equal or similar to this in mind. |
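The suggestion earlier in this thread asks for more space between posts when the queue is small and more regular posting when it is large, with some randomness on top. A minimal sketch of one such formula follows. The names `SpacingOptions` and `computeDelayMs` and all option fields are hypothetical and are not part of bsky.rss; the linear interpolation is just one possible choice.

```typescript
// Hypothetical options; none of these names come from bsky.rss.
interface SpacingOptions {
  minDelayMs: number;    // spacing used once the queue is "full"
  maxDelayMs: number;    // spacing used when the queue is empty
  fullQueueSize: number; // queue length at which spacing bottoms out
  jitterRatio: number;   // 0.2 means up to +/- 20% random fuzz
}

// Returns the wait before the next post, in milliseconds.
// A larger queue gives a shorter base delay; jitter makes timing less predictable.
function computeDelayMs(
  queueLength: number,
  opts: SpacingOptions,
  random: () => number = Math.random
): number {
  // Fraction of the "full" size, clamped to the range 0..1.
  const fullSize = Math.max(opts.fullQueueSize, 1);
  const fill = Math.min(Math.max(queueLength, 0) / fullSize, 1);

  // Linear interpolation from maxDelayMs (empty queue) to minDelayMs (full queue).
  const base = opts.maxDelayMs - fill * (opts.maxDelayMs - opts.minDelayMs);

  // Symmetric jitter in [-jitterRatio, +jitterRatio) of the base delay.
  const jitter = (random() * 2 - 1) * opts.jitterRatio * base;

  // Never go below the minimum spacing, so a full queue only jitters upward.
  return Math.max(opts.minDelayMs, Math.round(base + jitter));
}
```

Inside a loop like `processQueue`, the fixed delay calculation could be swapped for a call such as `computeDelayMs(queue.length, options)`. Passing `random` as a parameter keeps the function deterministic in tests.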
Will look into the randomness of posting from queue. As for the error when compiling, I must have left something in the code that I didn't use, I'll remove that. As for the undefined, I believe I left a console.log block in there, so I'll fix that too. |
Feedback: I could be wrong, but the current queue system does not seem to go back to fetching new posts once it's done; the CVE bot has not posted anything since 11:00 CET.
|
What's the URL of your feed? I'm working on a feature to allow more field compatibility. |
I just checked the branch without the queue system and the "no date provided" issue remains. It's not your code; it's the NIST feed itself being a weird RSS: https://nvd.nist.gov/feeds/xml/cve/misc/nvd-rss.xml |
When I wrap the NIST feed via Inoreader, I do get a date field: https://www.inoreader.com/stream/user/1005324229/tag/CVE |
I'm testing this branch and I can confirm something is strange: it's queuing but it's not posting? My RSS feed is https://ukraine.osintukraine.com/index.xml. This is a good feed for testing because it updates often and can easily have 50+ updates at a time. |
Something is still strange here, not sure what. The last post is: @AmplifyUkraine on 2023-08-05 15:41:59 (#514181), 20 min ago. The screenshot shows all the subsequent posts being queued, but they are not getting posted. One thing is clear: it's not hitting the rate limit, so the queue is actually working. It's just not clear how long it waits, and the output no longer says when it posts or when the next round of posts is due.
|
Alright, I adapted and rebuilt the container locally to run with these changes. I get this error:
so I'm not sure what's causing this |
- Remove duplicates by storing the URL of the RSS articles and prevent publishing the same article twice.
- Remove HTML tags and entities from the title.
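The two changes listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `FeedItem`, `cleanTitle`, `dropSeen`, and the small entity table are hypothetical, and the `seen` set would need to be persisted somewhere to survive restarts.

```typescript
// Hypothetical item shape; the real bsky.rss types may differ.
interface FeedItem {
  title: string;
  link: string;
}

// A deliberately small table; a full HTML entity list is much longer.
const NAMED_ENTITIES: Record<string, string> = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'", nbsp: " ",
};

// Strips HTML tags, decodes common entities, and collapses whitespace.
function cleanTitle(raw: string): string {
  const withoutTags = raw.replace(/<[^>]*>/g, "");
  const decoded = withoutTags.replace(
    /&(#x[0-9a-f]+|#\d+|[a-z]+);/gi,
    (match: string, body: string) => {
      if (body[0] === "#") {
        // Numeric entity: hexadecimal (&#x26;) or decimal (&#38;).
        const isHex = body[1].toLowerCase() === "x";
        const code = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
        return code >= 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match;
      }
      // Unknown named entities are left untouched.
      return NAMED_ENTITIES[body.toLowerCase()] ?? match;
    }
  );
  return decoded.replace(/\s+/g, " ").trim();
}

// Returns only items whose URL has not been seen, recording them as seen.
function dropSeen(items: FeedItem[], seen: Set<string>): FeedItem[] {
  const fresh: FeedItem[] = [];
  for (const item of items) {
    if (seen.has(item.link)) continue; // already published, or a duplicate in this batch
    seen.add(item.link);
    fresh.push({ ...item, title: cleanTitle(item.title) });
  }
  return fresh;
}
```

Keying on the article URL rather than the title means a retitled article is still treated as already published, which matches the intent of the first bullet.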
@rmdes Is that issue still happening or has that been resolved? I can't reproduce these issues so I wonder if it's something on Bluesky's end relating to your IP? |
Much more rarely lately. Sometimes 1 out of 20+ bots gets into that state, but after a docker compose restart it all runs fine. Other than that, it's all operating really well :) |
Still really odd that that's happening. I don't know an easy way to implement a restart if that happens. If you could give me a bullet point list of the scenario where that error occurs (ways to identify that specific error, e.g. large queue buildup + rate limiting), I can try and make a configurable function that can restart the service when that error happens. |
@rmdes, I think I'm going to get this merged into the main branch soon. As for your issue, if you could still send me that bullet list of the factors that contribute to the large queue build-up, I'll open another PR addressing that. If you don't think this is stable enough for your case at the moment, you can stick to v1 while I get that issue sorted out. I still haven't been able to reproduce it anywhere else. |
I'm not even sure myself why this happens. Maybe it's because my VPS hosts several bots and, combined, they manage to hit the limits? |
I'll follow your lead on this and update my bots to run v2 on the main branch. |
I'm posting this here because the bots where this happens are still running on the queue branch, but some bots now produce this kind of error output:
|
That error seems like it's something to do with the tsx runner. I don't know why that would be happening. Which docker image are you using? |
I'm using build 165829101 from the queue branch, the package provided by you from last month, not a local build. This is really new; I hadn't seen it before last week and only found time to report it now. It's also only happening on 1 or 2 bots, and I'm not sure why, since nothing changed on my side. |
Can you try |
Trying now..will report back about the tsx issue |
I think it's good now, no more tsx issues |
Good to hear, thanks for the update. |
Going to test this on one of my public bots for ~24 hours and then if all goes well, I should be able to merge this in. Thanks for everyone's contributions! |
I keep getting the now infamous and hard-to-track "Post rate limit exceeded" error: [bsky.rss POST] Post rate limit exceeded - process will resume after 30 seconds │ The RSS feed for this source is here. Granted, this is a particularly busy feed, because it merges several RSS feeds (from the main Italian media) into one and outputs the mix as a single RSS feed, which is then used to feed the bot. It happens rarely now, but it keeps coming up. I don't know whether we get enough logging from Bluesky to learn more about this. Usually a restart would do the trick, but now it keeps piling up items without ever flushing them. |
Sorry, I somehow missed this. Would it suffice to add a watch feature where it will check if the queue exceeds a certain amount of items, and if it does, it can restart itself? |
No worries. Provided we can set a threshold number in the config that triggers this restart, I think this could work. |
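The watch feature discussed above, with a configurable threshold, might look something like the sketch below. Everything here is an assumption for illustration: `WatchdogConfig`, `maxQueueLength`, `shouldRestart`, and `startQueueWatchdog` are hypothetical names, not existing bsky.rss options or functions.

```typescript
// Hypothetical config shape; not an existing bsky.rss setting.
interface WatchdogConfig {
  maxQueueLength: number; // restart once the queue grows past this; 0 disables the check
}

// Pure decision function, kept separate so it is easy to test.
function shouldRestart(queueLength: number, config: WatchdogConfig): boolean {
  return config.maxQueueLength > 0 && queueLength > config.maxQueueLength;
}

// Polls the queue length and calls onRestart once if the threshold is exceeded.
// Returns a function that stops the watchdog.
function startQueueWatchdog(
  getQueueLength: () => number,
  config: WatchdogConfig,
  onRestart: () => void,
  intervalMs: number = 60 * 1000
): () => void {
  const timer = setInterval(() => {
    if (shouldRestart(getQueueLength(), config)) {
      clearInterval(timer); // fire only once
      onRestart();
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

In a Docker setup, one option is for `onRestart` to log the queue size and call `process.exit(1)`, leaving the actual restart to the container's restart policy. Whether that fits how bsky.rss manages its process is for the maintainer to judge.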
This PR introduces v2 of bsky.rss.
Testing and feedback are greatly appreciated.
Closes #14, #29, #27, #33, #51
If you think other features should be included with this PR, please leave a comment.