Fixes to username status processing #7
Changed:
- `parse_watchlist`: uses the space separating the user status and the user name to get the user.
- `parse_user_page`: uses the URL in the meta tag to get the user name. However, the user name from the meta tag URL is lower case, so a length comparison is used to determine whether a status symbol is present.

Added:
- `parse_username_from_url`: takes a URL and returns the username, under the assumption that the last part of the URL is the username.

What is still needed:
- Adjustment of the `parse_user_tag` function.
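For readers following along, here is a minimal sketch of what a helper like `parse_username_from_url` could look like. It is not the code from this PR: the body, the empty-string fallback, and the example URL are assumptions, and it only relies on the stated premise that the last path segment is the username.

```python
from urllib.parse import urlparse


def parse_username_from_url(url: str) -> str:
    """Return the last non-empty path segment of a URL (assumed to be the username)."""
    # Split the path and drop empty pieces so a trailing slash does not matter.
    parts = [part for part in urlparse(url).path.split("/") if part]
    # Assumption: return an empty string when the URL has no path segments at all.
    return parts[-1] if parts else ""


# Hypothetical example URL, with the user name in lower case as noted above.
print(parse_username_from_url("https://www.furaffinity.net/user/username/"))
```

The value returned this way is the URL form of the name, so it still has to be compared against the displayed name to work out whether a status symbol is present.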
|
I hope this is good?
|
I actually didn't mean to merge it yet, damn GitHub interface x3 It fails the static type tests (because of missing type annotations), but the unit tests succeeded :) I'll add the types and do some more thorough testing for edge cases, but it looks good!
|
Well, so far it seems to run very smoothly.
|
I changed the watchlist parser to check the actual link, which contains the username without the status. It then removes the links and uses the remaining text, stripped of whitespace. I really like using the meta tag for the username, but unfortunately FA removes some underscore characters from URLs, so for example a user called "user_name" would have a URL of "username", thus failing the equal length check :( Not really sure why they specifically remove underscores, though I suspect it has to do with SQL queries (SQL like uses
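To make the failure concrete, here is a small sketch of a length-based check and the underscore case that trips it. The function name and the exact form of the comparison are assumptions for illustration; the real check in the PR may differ in detail.

```python
def has_status_by_length(display_name: str, url_name: str) -> bool:
    """Assumed form of the length check: any length difference means a status symbol."""
    return len(display_name) != len(url_name)


# A status symbol adds one character, so the check reports a status here.
print(has_status_by_length("~Username", "username"))
# Same length, different case only: no status reported.
print(has_status_by_length("Username", "username"))
# False positive: the underscore is missing from the URL name,
# so the lengths differ even though there is no status symbol.
print(has_status_by_length("user_name", "username"))
```

The last call is the problematic one: the first character of "user_name" would be treated as a status symbol and cut off.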
|
Changed it to this:

```python
status: str = ""
name: str = tag_status.text.strip()
if username_url(name) != username_url(parse_username_from_url(tag_meta_url["content"])):
    status, name = name[0], name[1:]
```
I'm currently trying to come up with test cases to make sure it will actually work. I usually don't do this kind of comparison because it can evaluate to true in some weird cases.
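A few test cases for the normalized comparison could be sketched roughly like this. `normalize_username` is a hypothetical stand-in for `username_url` that only lower-cases and drops underscores (an assumption based on the behaviour described in this thread), and `last_url_segment` and `split_status` are illustrative names, not functions from the library.

```python
from typing import Tuple
from urllib.parse import urlparse


def normalize_username(name: str) -> str:
    # Hypothetical stand-in for username_url: lower-case and drop underscores.
    return name.lower().replace("_", "")


def last_url_segment(url: str) -> str:
    # Last non-empty path segment, assumed to be the URL form of the username.
    parts = [part for part in urlparse(url).path.split("/") if part]
    return parts[-1] if parts else ""


def split_status(display_name: str, meta_url: str) -> Tuple[str, str]:
    """Split a displayed name into (status, name) by comparing normalized forms."""
    status, name = "", display_name.strip()
    # If the normalized names differ, assume the first character is the status symbol.
    if name and normalize_username(name) != normalize_username(last_url_segment(meta_url)):
        status, name = name[0], name[1:]
    return status, name


url = "https://www.furaffinity.net/user/username/"  # hypothetical example URL
assert split_status("~User_Name", url) == ("~", "User_Name")
assert split_status("User_Name", url) == ("", "User_Name")
# A leading underscore is not mistaken for a status symbol.
assert split_status("_under", "https://www.furaffinity.net/user/under/") == ("", "_under")
```

Under these assumptions the comparison no longer depends on length, so the underscore case passes. Whether the real `username_url` strips anything else is not covered here.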
|
It's all integrated now, thank you for sharing your code!
|
I'll just wait a bit before publishing a new release though, wouldn't want to forget something, again x3
|
Oh! Frankly, I hadn't noticed that the URLs behaved like this. Of course the length check will fail that way. Thankfully, the way the check worked allows only one case where it gives a faulty result. Lucky me and my still running code...
|
I found the edge case. There is a bug in FA's template that does not add the |
|
I've restored the admin image check for folders, it's still there for user pages. Sorry it didn't work :(
|
I mean... I already knew that it wasn't in the template for This is why I didn't implement it and said it still needed to be done (some way). Nonetheless, one could extract the username from one of the other paths of one of the tags on the journals site still 🤔 |
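One way to pull the username out of another path on the page might look like the sketch below. It uses only the standard library `html.parser`; the `/journals/` prefix and the sample markup are assumptions about how the page links to itself, and the function and class names are made up for illustration.

```python
from html.parser import HTMLParser
from typing import List, Optional


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def username_from_path(html: str, prefix: str = "/journals/") -> Optional[str]:
    """Return the path segment following `prefix` in the first matching link, if any."""
    collector = HrefCollector()
    collector.feed(html)
    for href in collector.hrefs:
        if href.startswith(prefix):
            parts = [part for part in href[len(prefix):].split("/") if part]
            if parts:
                return parts[0]
    return None


# Hypothetical markup; the real page structure may differ.
sample = '<a href="/user/username/">Profile</a><a href="/journals/username/">Journals</a>'
print(username_from_path(sample))
```

Like the meta tag URL, a path found this way would carry the URL form of the name, so the comparison against the displayed name is still needed to detect a status symbol.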
|
Still, it was a really good method and I hoped to use it for all pages. Alas, the journals page is missing almost all og meta tags :(

```html
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Fur Affinity | For all things fluff, scaled, and feathered!">
<meta name="keywords" content="fur furry furries fursuit fursuits cosplay brony bronies zootopia scalies kemono anthro anthropormophic art online gallery portfolio">
<meta name="distribution" content="global">
<meta name="copyright" content="Frost Dragon Art LLC">
<meta http-equiv="X-UA-Compatible" content="IE=9; IE=EDGE">
<meta property="og:image" content="https://www.furaffinity.net/themes/beta/img/banners/fa_logo.png?v2">
<meta name="twitter:image" content="https://www.furaffinity.net/themes/beta/img/banners/fa_logo.png?v2">
```

It could be changed to find the link to the journals page itself, using