Author field shows HTML tags when template used #5030

Open
pigsonthewing opened this issue Aug 7, 2022 · 12 comments · May be fixed by #5069
@pigsonthewing

Summary

When viewing my recent uploads in the app (using beta version 4.0.3~23c474b06), the attribution includes visible HTML markup and incorrect text. This apparently happens when {{Creator:Andy Mabbett}} is present on the Commons file page, because the template pulls in the description from the associated Wikidata item. For files whose author is given in plain text, the issue does not occur.

Steps to reproduce

As above

Expected behaviour

Correct attribution should be shown.

Actual behaviour

As above

Device name

No response

Android version

Android 11

Commons app version

4.0.3~23c474b06

Device logs

No response

Screen-shots

https://commons.wikimedia.org/wiki/File:Commons_app_beta_screenshot_showing_attrbution_text_errors_-_2022-08-07.jpg

Would you like to work on the issue?

Prefer not

@pigsonthewing pigsonthewing changed the title from "[Bug]: Incorrect ttrbution shown when Template:Creator is present" to "[Bug]: Incorrect attribution shown when Template:Creator is present" Aug 7, 2022
@whym
Collaborator

whym commented Aug 9, 2022

What would be the correct attribution to show in this case? Should the app try to convert the content of {{Creator:Andy Mabbett}} to (short) plain text somehow?

@whym
Collaborator

whym commented Aug 9, 2022

I tested it and it happened with https://commons.wikimedia.org/wiki/File:Zhu-Zhanji-Gibbons-at-Play.jpg . In this case, the creator is not me. (I uploaded a public domain artwork.)

Screenshot_20220809-173537

@pigsonthewing
Author

> What would be the correct attribution to show in this case? Should the app try to convert the content of {{Creator:Andy Mabbett}} to (short) plain text somehow?

I think that the template's whole text would be too much text. The simplest solution may be to render the name of the individual template without the "Creator:" prefix; so in the first case above, "Andy Mabbett", and for the second (your) example, "Xuande".

Alternatives for your example would be to use the top line's link text ("Xuande Emperor") or the entire top line's text ("Xuande Emperor  (1399–1435)")
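
The simplest option above, dropping the "Creator:" prefix, could be sketched roughly as follows. This is only an illustration in Python (the app itself is written in Kotlin/Java); `creator_display_name` is a hypothetical helper, not an existing function in the app, and it assumes the author field still contains the raw wikitext rather than rendered HTML.

```python
import re

# Hypothetical pattern: matches a creator transclusion such as
# "{{Creator:Andy Mabbett}}" and captures the part after "Creator:".
# Any template parameters after a "|" are ignored.
_CREATOR_RE = re.compile(
    r"\{\{\s*Creator:\s*([^|}]+?)\s*(?:\|[^}]*)?\}\}",
    re.IGNORECASE,
)


def creator_display_name(author_wikitext: str) -> str:
    """Return the creator template's name, or the input unchanged if none is found."""
    match = _CREATOR_RE.search(author_wikitext)
    if match is None:
        # Plain-text authors pass through untouched.
        return author_wikitext.strip()
    return match.group(1).strip()
```

With the two examples from this thread, `{{Creator:Andy Mabbett}}` would give "Andy Mabbett" and `{{Creator:Xuande}}` would give "Xuande".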

@nicolas-raoul nicolas-raoul changed the title from "[Bug]: Incorrect attribution shown when Template:Creator is present" to "Author field shows HTML tags when template used" Sep 19, 2022
DamienBradleyDSP added a commit to DamienBradleyDSP/apps-android-commons that referenced this issue Oct 11, 2022
…used"

Added specific display author function in Media.kt that parses the author string with Regex
@nicolas-raoul
Member

The fix by Damien and Shankar seems to work with all of the users I have tried (Andy and many seen in the recent featured images).

@pigsonthewing If you happen to know users with other patterns or edge-case author names, please let us know. :-)

adb

@nicolas-raoul
Member

I wonder what we should do in this case:

https://commons.wikimedia.org/wiki/File:Mastl%C3%A9_Odles_Stevia.jpg

|Author={{User:Moroder/Template:Credits|Photos by Moroder for natural monuments in South Tyrol}}

Screenshot 2023-03-24 at 16 43 47

Current master:
adb

Pull request:
adb2

Would it be acceptable to give up and just display the username, which is simpler?
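
For illustration, "just display the username" for a user-namespace template like the one above might look like this. It is a rough Python sketch rather than app code: `username_from_author_field` is a hypothetical name, and it assumes the raw `|Author=` wikitext is available.

```python
import re
from typing import Optional

# Hypothetical pattern: captures the username from a transclusion in the
# User: namespace, stopping at the first "/", "|" or "}".
_USER_TEMPLATE_RE = re.compile(r"\{\{\s*User:([^/|}]+)")


def username_from_author_field(author_wikitext: str) -> Optional[str]:
    """Return the username of a {{User:...}} transclusion, or None if absent."""
    match = _USER_TEMPLATE_RE.search(author_wikitext)
    if match is None:
        return None
    return match.group(1).strip()
```

For the Author value quoted above this would yield "Moroder"; fields without a `User:` template return `None`, so the caller can fall back to something else.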

@shankarpriyank
Contributor

> |Author={{User:Moroder/Template:Credits|Photos by Moroder for natural monuments in South Tyrol}}

Hey @nicolas-raoul, are the other edge cases in the same format as this?
If they are, maybe we can write separate parsing logic for this type of case. As far as I understand, that would be really challenging (but possible).

@RitikaPahwa4444
Collaborator

I tried parsing the author name using the official API, and I think the current master is displaying the HTML text for the camera icon (the very first icon that appears in the author field here). This may be related to #5075, in which the edit icon is also included in the description. The pull request does not show it because it returns " " when there are multiple HTML tags. Parsing it directly would still give a link to the camera icon.
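
One possible way to keep only the visible text and drop icon markup, sketched in Python with the standard library's `html.parser` (the app would need a Kotlin/Java equivalent). `author_html_to_text` is a hypothetical name, and the sample markup in the usage note is made up to resemble an icon followed by a linked name; it is not the actual API output.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and ignores every tag, including <img> icons."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def author_html_to_text(author_html: str) -> str:
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(author_html)
    parser.close()  # flush any text still buffered by the parser
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join("".join(parser.parts).split())
```

For a fragment such as `<a href="#"><img src="camera.png" alt=""></a> <a href="#">Wolfgang Moroder</a>`, only the link text remains, because image tags carry no text nodes.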

@whym
Collaborator

whym commented Mar 25, 2023

I browsed the featured pictures section of the app, and I found another example.
https://commons.wikimedia.org/wiki/File%3AHuman_karyotype_with_bands_and_sub-bands.png
0Screenshot_20230325-162102

Another file with non-obvious author information (found outside the app). I'm not sure whether a user could currently encounter this in the app, but I think it would be good to support it, if possible.
https://commons.wikimedia.org/wiki/File:Brut%27s_Chronicle_of_England_and_the_Destruction_of_Jherusalem,_Vindicta_Salvatoris_in_English_-_DPLA_-_9beee1bd269ee599f88298c7583adc43_(page_64).jpg

@whym
Collaborator

whym commented Mar 25, 2023

In the case of File:Mastlé_Odles_Stevia.jpg, the file's structured data has "Wolfgang Moroder" as the value of the "author name string" property (P2093). Using it seems like a clean approach. I don't know what percentage of files have it (with a correct value), though.

@nicolas-raoul
Member

Structured Data sounds very promising, allowing us to avoid the HTML parsing issue altogether.

Sometimes the creator is an item; in such cases we can take its caption: https://commons.wikimedia.org/wiki/File:Statue_of_Sir_John_Dinham.jpg

@nicolas-raoul
Member

Out of 10 random files only one had no author information: https://commons.wikimedia.org/wiki/File:Van_Houten_Cacao_top.JPG

I would be in favor of using this Structured Data by default, falling back to HTML parsing when it is not present.
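
A minimal sketch of that "Structured Data first, HTML parsing second" order, in Python for illustration only. The function names are hypothetical, and the input shape is an assumption: a dict of statements keyed by property ID, with P2093 stored as a qualifier on a creator (P170) statement. The real response layout should be checked against the API before relying on this.

```python
from typing import Callable, Optional


def author_from_structured_data(statements: dict) -> Optional[str]:
    """Return the first 'author name string' (P2093) found, else None.

    Assumed layout: statements["P170"] is a list of creator statements,
    each with optional qualifiers["P2093"] holding string datavalues.
    """
    for statement in statements.get("P170", []):
        for snak in statement.get("qualifiers", {}).get("P2093", []):
            value = snak.get("datavalue", {}).get("value")
            if isinstance(value, str) and value.strip():
                return value.strip()
    return None


def display_author(
    statements: dict,
    author_html: str,
    html_fallback: Callable[[str], str],
) -> str:
    """Prefer Structured Data; use the HTML-based parser only when it is missing."""
    name = author_from_structured_data(statements)
    if name is not None:
        return name
    return html_fallback(author_html)
```

Passing the fallback in as a parameter keeps the two strategies independent, so whichever HTML parsing approach is chosen can be swapped in without touching the Structured Data lookup.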

@pigsonthewing
Author

The original issue is not fixed, as seen in my recent uploads (cropped screenshot attached)
2023-12-20 20 37 33
