Pixiv downloading "Work cannot be displayed" image #4327

Open
501stRookie opened this issue Jul 20, 2023 · 20 comments

@501stRookie

Starting today, when I try to download an image from Pixiv, it downloads this image instead, with Japanese text that says "This work cannot be displayed".

It seems to only happen on posts that were recently posted, as images that were uploaded yesterday and older download fine.
110073090_0

@thatfuckingbird
Contributor

thatfuckingbird commented Jul 20, 2023

Can confirm, seeing this too. Refreshing the login token doesn't help and it doesn't seem to be related to whether the image is r18 or not.
The URL I get for this image is https://s.pximg.net/common/images/limit_sanity_level_360.png
The metadata json seems to be mostly fine, but it also contains a "sanity_level": 4, entry.

Update:
Tried PixivUtil2, it doesn't have this problem.

Update2:
Tried the official Pixiv app, these images also do not show up there. The last displayed images (if I sort by "newest") are from yesterday. Except ugoira, those show for some reason.

Also might be related issue here: upbit/pixivpy#275

Update3:

If you check https://www.pixiv.net/info.php?cid=1&lang=en there are announcements about the suspension and reinstatement of their mobile apps in app stores, related to content in the apps. I think they might be doing some kind of semi-manual filtering now, which causes this lag between the mobile app API and the website.
This might mean we can no longer use that API, at least for downloading the images themselves.
Also for the future, it might be a good idea to detect that limit_sanity_level placeholder image and error on it.

Update4:

The metadata is incomplete too for these images (no tags).
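
The placeholder check suggested above could look roughly like this. It is only a sketch: the function names are made up for illustration, and matching size variants other than the `_360` URL reported in this thread is an assumption.

```javascript
// Matches the placeholder URL reported in this thread.
// ASSUMPTION: other sizes follow the same "limit_sanity_level_<n>.png" pattern.
const SANITY_PLACEHOLDER =
    /^https:\/\/s\.pximg\.net\/common\/images\/limit_sanity_level_\d+\.png$/;

// Hypothetical helper: true if the URL is the "work cannot be displayed" image.
function isSanityPlaceholder(url) {
    return SANITY_PLACEHOLDER.test(url);
}

// Hypothetical helper: fail loudly instead of saving the placeholder.
function assertDownloadable(url) {
    if (isSanityPlaceholder(url)) {
        throw new Error(`Pixiv returned the "work cannot be displayed" placeholder: ${url}`);
    }
    return url;
}
```

Calling `assertDownloadable(url)` before each download would turn a silent wrong file into a visible error.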

@Slider-Whistle

Doesn't appear to be happening on my end.

@kattjevfel
Contributor

I ran into this but I can no longer reproduce it, so must've been something temporary.

@thatfuckingbird
Contributor

thatfuckingbird commented Jul 20, 2023

It is still happening (as I'm writing this), but the lag between what is available on the mobile app API and what's visible on the site has decreased. Currently I see about an 8-10 minute lag until an image shows up on the mobile app (looking at the posting times). You can reproduce this if you go to the site, search for some very common tag like "illustration" and try to download the newest entry chronologically (check if it was posted in the last few mins).

The other question is, is there any content that won't be available on the mobile API at all? I haven't encountered anything like that yet, but since this whole thing might be caused by the mobile app doing some additional filtering due to app store requirements, it can't be ruled out.

For now I think a temporary solution would be to catch these cases when an invalid image is returned (easy from the URL) and either error or try to wait 5-10-15 mins like in the case of rate limits. If the lag between the site and image availability in the API remains low then this might be enough, maybe along with some informational message in these cases.

Ultimately if the time lag between the mobile API and the site keeps randomly increasing/decreasing or the mobile API becomes filtered in some other way then a switch to the non-mobile API (the one the website uses) might be needed.
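
A minimal sketch of the "wait and retry" idea, assuming a caller-supplied `getImageUrl` function that asks the mobile API for the image URL again. `resolveWithRetry`, its options, and the 5/10/15 minute defaults are illustrative choices, not anything gallery-dl implements.

```javascript
// Hypothetical helper: re-query the image URL while the API still returns
// the "limit_sanity_level" placeholder, waiting between attempts.
async function resolveWithRetry(getImageUrl, {
    delaysMs = [5, 10, 15].map(minutes => minutes * 60_000),
    sleep = ms => new Promise(resolve => setTimeout(resolve, ms)),
} = {}) {
    const isPlaceholder = url => url.includes("/limit_sanity_level_");

    let url = await getImageUrl();
    for (const delay of delaysMs) {
        if (!isPlaceholder(url)) {
            return url;
        }
        console.warn(`Placeholder image returned, retrying in ${delay} ms`);
        await sleep(delay);
        url = await getImageUrl();
    }
    if (isPlaceholder(url)) {
        throw new Error("Image still unavailable from the mobile API after all retries");
    }
    return url;
}
```

Passing `sleep` as an option keeps the waiting strategy replaceable, for example to reuse an existing rate-limit backoff.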

@Dartkun

Dartkun commented Jul 21, 2023

Still happening on my end. Also only for basically brand new pictures.

@alleneko

Happens for me when I'm downloading from my bookmarks, but doesn't happen when I use the search page or individual posts.

@mikf
Owner

mikf commented Jul 22, 2023

https://s.pximg.net/common/images/limit_sanity_level_360.png images now get ignored
(a45a17d (yes, that's the wrong issue number ...))

To manually ignore them, enable url-metadata and --filter them that way.

> The other question is, is there any content that won't be available on the mobile API at all?

I've noticed that search results, and only those, do not include R-18G works.

@thatfuckingbird
Contributor

> I've noticed that search results, and only those, do not include R-18G works.

Works for me, it might be your account settings (there is a separate toggle for r18g iirc)

@mikf
Owner

mikf commented Jul 31, 2023

These settings are enabled for all of my accounts.
It is working again, but it definitely wasn't when I posted #4327 (comment).

Are these "Work cannot be displayed" images still a thing or did Pixiv somehow fix whatever these were meant for?

(I've never encountered one of these or a "Skipping 'sanity_level' warning" logging message myself)

@thatfuckingbird
Contributor

There's still at least a few minutes of lag before images displayed on the website also appear in the app, so if you happen to download very recent image URLs those will still produce the sanity_level image. I think we will just have to live with this for the time being, since it's probably not worth a rewrite to switch to the API that the website uses.

@mikf
Owner

mikf commented Jul 31, 2023

Yeah, I'd really want to avoid using the website API if at all possible. It is a lot slower, requires an extra request for each individual post, and, more importantly, would need exported cookies for authentication, which expire in a month or so.

I did try to rewrite the current extractor back when auth with username and password got disabled and it wasn't a "pleasant" experience, to say the least.

@AlttiRi

AlttiRi commented Aug 23, 2023

Just a short summary of #4421 (comment)

In Pixiv's Android application, and therefore in gallery-dl too:

  • There are shadow banned images. They are hidden from profiles entirely (/v1/user/illusts?user_id= just does not return any information for them), but they are still listed in bookmarks (/v1/user/bookmarks/illust?user_id=) with a dummy thumbnail in the app. They can't be downloaded even with an artwork URL (/v1/illust/detail?illust_id=); the endpoint returns a trimmed response with "visible": false.
  • There are "soft" shadow banned images. They behave like normal images, except that the caption (description) is missing.
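
As a rough illustration of the two cases above, an app API result could be classified like this. The object shape (`visible` and `caption` fields directly on the illust object) is an assumption based on the description in this comment, and `classifyIllust` is a hypothetical name.

```javascript
// Hypothetical helper: classify one illust object from the app API.
// ASSUMPTION: "visible" and "caption" are direct fields of the object.
function classifyIllust(illust) {
    if (illust == null) {
        return "missing";        // not returned at all, e.g. absent from a profile listing
    }
    if (illust.visible === false) {
        return "shadow-banned";  // trimmed response, cannot be downloaded
    }
    if (!illust.caption) {
        // Ambiguous: a soft shadow ban, or the author simply wrote no caption.
        return "empty-caption";
    }
    return "ok";
}
```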

@AlttiRi

AlttiRi commented Sep 21, 2023

It seems the caption "bug" is "fixed".
But some images still have "visible": False, so gallery-dl does not see them when it downloads a profile's images.


Upd 2023.10.08: The "bug" has returned.

@thatfuckingbird
Contributor

Encountered another image that won't download (giving skip sanity_level warning in the log),
(NSFW warning) https://www.pixiv.net/en/artworks/109487939 . Interesting because none of the artist's other works seem to be affected and by pixiv standards it's rather tame too.

@Sherman-Liu

Same issue here: #4760 (comment)

@akinokonomi

Seeing Skipping 'sanity_level' warning too.
Not nsfw https://www.pixiv.net/en/artworks/102932581

gallery-dl seems to silently skip it, maybe add a more explicit error/warning?

I only noticed it was being skipped after passing the --verbose argument.

@thatfuckingbird
Contributor

Might be a good idea to add these post URLs to the output of --write-unsupported.

@espressoelf

The best solution would be falling back to a secondary extractor that doesn't use Pixiv's mobile API. It's like @thatfuckingbird pointed out: Pixiv is taking measures to keep their mobile apps in the stores. Unfortunately, the automatic flagging is rather trigger-happy, producing many false positives. From what I saw, there also seems to be no publicly visible indicator or any way to appeal the flag, so finding a way around it is very important for every data hoarder.

@AlttiRi

AlttiRi commented Feb 24, 2024

I think it makes sense to add support for using the web API in addition to the Android app's API.

Since the mobile API does not return shadow banned artworks, it would require an extra call to the site's API to get all artwork IDs:

Object.keys((await (await fetch("https://www.pixiv.net/ajax/user/1657441/profile/all?lang=en")).json()).body.illusts)

So, you can find the missing artworks.

To get the info for them:

(await (await fetch("https://www.pixiv.net/ajax/illust/113897896?lang=en")).json()).body

For ugoira, also:

(await (await fetch("https://www.pixiv.net/ajax/illust/113897896/ugoira_meta?lang=en")).json()).body
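
Finding the artworks that the app API leaves out then comes down to comparing the two ID lists. A small sketch, with `findMissingIds` as a hypothetical helper name; it assumes you already have the IDs from the site's API and from the app API as arrays:

```javascript
// Hypothetical helper: IDs listed by the site's API but absent from the app API.
// IDs are normalized to strings because the site's API uses them as object
// keys, while another source may return them as numbers.
function findMissingIds(webIds, appIds) {
    const seen = new Set(appIds.map(String));
    return webIds.map(String).filter(id => !seen.has(id));
}
```

For example, `findMissingIds(["1", "2", "3"], [1, 3])` returns `["2"]`.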

However, it seems it's not possible to tell (in the app API) whether a caption was removed due to a soft shadow ban or the author just did not add one.

For example: https://www.pixiv.net/en/artworks/103983466 is visible, but it has no caption. "Soft shadow banned".

While these, https://www.pixiv.net/en/artworks/102932581 and https://www.pixiv.net/en/artworks/109211067, are additionally hidden from the profiles. They can't be downloaded with gallery-dl now (it returns a response with visible: False). "Shadow banned".

So, if you need the description for meta files, the site's API has to be used every time the caption is empty, even when the artwork is not shadow banned.


Also, the site's API returns the description with links wrapped in <a href="/jump.php?....
There is also extraData.meta.twitter.description.
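
If the plain link targets are wanted in the metadata, the wrapped links could be unwrapped along these lines. This sketch assumes the redirect target is the percent-encoded query string right after `/jump.php?`, which is not confirmed in this thread, and `unwrapJumpLinks` is a hypothetical name.

```javascript
// Hypothetical helper: replace href="/jump.php?<encoded target>" with the
// decoded target URL.
// ASSUMPTION: everything after "/jump.php?" is the percent-encoded target.
function unwrapJumpLinks(html) {
    return html.replace(/href="\/jump\.php\?([^"]*)"/g, (match, encoded) => {
        try {
            return `href="${decodeURIComponent(encoded)}"`;
        } catch {
            return match; // leave malformed links untouched
        }
    });
}
```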


JS code to collect all info from a https://www.pixiv.net/en/users/123456 page:

// Run in the browser console on a profile page (uses top-level await).
// Optional extra request headers.
const headers = {
    // "user-agent": `...`,
    // "cookie": `...`,
};

// Numeric user ID taken from the page URL, e.g. https://www.pixiv.net/en/users/7386235
const profileId = document.location.pathname.match(/(?<=users\/)\d+/)[0];

// Step 1: list every illustration ID on the profile via the site's API.
const ids = Object.keys((await (await fetch(`https://www.pixiv.net/ajax/user/${profileId}/profile/all?lang=en`, {
    headers: {
        "referer": `https://www.pixiv.net/en/users/${profileId}`,
        ...headers
    }
})).json()).body.illusts);

// Step 2: fetch the metadata of each artwork, one request per ID.
const json = {};
for (const id of ids) {
    const body = (await (await fetch(`https://www.pixiv.net/ajax/illust/${id}?lang=en`, {
        headers: {
            "referer": `https://www.pixiv.net/en/artworks/${id}`,
            ...headers,
        }
    })).json()).body;
    json[id] = body;
}

// Step 3: save everything as a single JSON file.
downloadBlob(new Blob([JSON.stringify(json, null, " ")]), `[pixiv][json] ${profileId}${json[ids[0]]?.userName} (${ids.length}).json`, document.location);

// Triggers a browser download of `blob` under the file name `name`;
// `url` is appended to the blob URL as a fragment.
function downloadBlob(blob, name, url) {
    const anchor = document.createElement("a");
    anchor.setAttribute("download", name || "");
    const blobUrl = URL.createObjectURL(blob);
    anchor.href = blobUrl + (url ? ("#" + url) : "");
    anchor.click();
    // Release the blob URL once the download has had time to start.
    setTimeout(() => URL.revokeObjectURL(blobUrl), 3000);
}

@AlttiRi

AlttiRi commented Feb 24, 2024

> It is a lot slower, requires an extra request for each individual post.

Optional mixed mode:

  • Use the web API for profile URLs (/users/) to check for missing artworks and download them if they exist,
  • Use the web API for direct URLs (/artworks/) when visible: False,
  • Optionally, use the web API for artworks with an empty caption,
  • Use the Android app API for everything else.
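
The per-artwork decision in this mixed mode could be sketched as a small routing function. `pickApi` and the `webForEmptyCaption` option are hypothetical names, and the `visible` / `caption` fields are assumed to sit directly on the app API's illust object.

```javascript
// Hypothetical helper: decide which API to use for one artwork.
// `appIllust` is the app API's result for that artwork, or null/undefined
// if the app API did not list it at all (e.g. missing from a profile).
function pickApi(appIllust, { webForEmptyCaption = false } = {}) {
    if (appIllust == null) {
        return "web";   // found via the site's API only
    }
    if (appIllust.visible === false) {
        return "web";   // direct URL, but the app API refuses to show it
    }
    if (webForEmptyCaption && !appIllust.caption) {
        return "web";   // optional: recover a possibly stripped caption
    }
    return "app";       // everything else stays on the Android app API
}
```

With this split, the slower web API is only hit for the artworks the app API cannot serve.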

> and, more importantly, would need exported cookies for authentication, which expire in a month or so.

That is a smaller problem than the missing images/descriptions (which may contain useful links).


Using the other API endpoints seems very simple; however, from what I can see, the JSON data they return is formatted a bit differently.
