Plurk - More granular control #1111
Comments
I'd also note that Plurk URLs are processed by DirectlinkExtractor, so by default all of them end up in the directlink folder and are not grouped by user. This also makes it harder to filter out other non-Plurk links such as Twitter, even with whitelist/blacklist. UPD: Actually, never mind, I understand why it works like that. Still, more granular control is indeed needed. |
Does your update mean that it is working as expected for you now? |
No, and I'm not the author of this issue. The filters described by the author would be really helpful. Also, I've reconsidered my update: I think handling native Plurk links with the Plurk extractor is not such a bad idea. |
Yes, the bigger issue with this is that the Plurk extractor currently does not provide any filename or directory keywords. So if you want to customize your filename or extract metadata, it's not possible right now. |
@mikf By the way, I tried to implement a temporary solution to the directory issue, and this is what I figured out:
As was said, Plurk's images are transferred to the directlink extractor for some reason, so by default the result will be:
Unless I misunderstand how
But instead it looks like this:
So, the directory has changed, but it is the Plurk extractor's default directory name, not the one from the config. Meanwhile, the filename hasn't changed at all and remains the directlink extractor's default. Is it intended that Also, setting And another question: why is the Plurk extractor considered a manga chapter extractor? |
@mikf still doesn't seem to work properly. |
The Plurk extractor lacks the granular control that other Twitter-like extractors provide.
Some options that I think would be useful:
From what I gather, there are a few possible types of content within a plurk. Being able to control which kinds of content you want to get would be useful:
Images uploaded directly to Plurk. Right now the extractor seems to treat them as simple direct links without any filename customization support.
Don't know if anyone would find this useful though.
Links to other plurks.
Links to external sites like YouTube, Imgur, etc.
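A minimal sketch of how such a content-type filter might work, assuming URLs have already been pulled out of a plurk's text. The host names, the `/p/` path check, and the helper names `classify_url` and `select_urls` are illustrative assumptions, not part of gallery-dl:

```python
from urllib.parse import urlparse

# Assumed host names for illustration; not taken from gallery-dl's source.
PLURK_IMAGE_HOSTS = {"images.plurk.com"}
PLURK_HOSTS = {"plurk.com", "www.plurk.com"}


def classify_url(url):
    """Return 'image', 'plurk', or 'external' for a URL found in a plurk."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in PLURK_IMAGE_HOSTS:
        return "image"      # image uploaded directly to Plurk
    if host in PLURK_HOSTS and parsed.path.startswith("/p/"):
        return "plurk"      # link to another plurk
    return "external"       # everything else: YouTube, Imgur, ...


def select_urls(urls, wanted):
    """Keep only the URLs whose category is in the `wanted` set."""
    return [u for u in urls if classify_url(u) in wanted]
```

With something like this, a user-facing option could map to the `wanted` set, for example `{"image"}` to download only directly uploaded images.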
Just a normal post, equivalent to a tweet.
It seems to be the equivalent of retweets. Filtering these out would be useful if you want to download content from only a single user profile.
There doesn't seem to be any equivalent for quoted tweets yet.
You can like plurks, but from what I can see your likes are not publicly displayed as they are on Twitter. Maybe the API supports it?
Not sure what this does. Would appreciate it if anyone can clarify.
It would also be nice if you could choose to process only the comments made by the original poster of the plurk. If the API somehow allows you to request comments that a user has made on plurks from other profiles, that would be a neat thing to include too.
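Both the replurk filter and the original-poster-only comment filter come down to comparing user IDs. A rough sketch, assuming each plurk is a dict with an `owner_id` field and each comment is a dict with a `user_id` field; these field names and both helper functions are hypothetical and would need checking against what the extractor actually receives:

```python
def own_plurks(plurks, profile_user_id):
    """Drop replurks: keep only plurks written by the profile's owner.

    Assumes each plurk dict carries an 'owner_id' (hypothetical field name).
    """
    return [p for p in plurks if p.get("owner_id") == profile_user_id]


def comments_by_op(plurk, comments):
    """Keep only comments written by the plurk's original poster.

    Assumes each comment dict carries a 'user_id' (hypothetical field name).
    """
    op_id = plurk["owner_id"]
    return [c for c in comments if c.get("user_id") == op_id]
```

Comments a user made on plurks from other profiles cannot be found this way; that would depend on whether the API offers such a query at all.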
Not sure if scraping R18 content requires authentication or not.
If anyone thinks something is missing here, please do tell in the comments.