Batch upload strategy for "book mode" pixiv manga #2608

Closed
Type-kun opened this issue Jun 8, 2016 · 5 comments

@Type-kun
Collaborator

Type-kun commented Jun 8, 2016

Pixiv seems to have a new mode for displaying manga pages, with a reader much like the one on comic.pixiv.net, but thankfully with less obsessive protection. It's possible to grab the links to the individual pages: they're embedded in <script> elements inside <head>, so a regexp for those should be pretty simple.

The bad thing is that the URL is exactly the same for regular manga mode and "book" mode, so it'll be necessary to find some kind of marker inside the page itself.


Example:

http://www.pixiv.net/member_illust.php?mode=manga&illust_id=57045668
In the HTML code, there'll be multiple elements like this (line breaks added for clarity):

<script>
pixiv.context.images[0] = "http:\/\/i1.pixiv.net\/c\/1200x1200\/img-master\/img\/2016\/05\/24\/20\/50\/19\/57045668_p0_master1200.jpg";
pixiv.context.thumbnailImages[0] = "http:\/\/i1.pixiv.net\/c\/128x128\/img-master\/img\/2016\/05\/24\/20\/50\/19\/57045668_p0_square1200.jpg";
pixiv.context.originalImages[0] = "http:\/\/i1.pixiv.net\/img-original\/img\/2016\/05\/24\/20\/50\/19\/57045668_p0.jpg";
</script>

We need to find code starting with pixiv.context.originalImages[NN] =, then get the text until the next semicolon and remove the unnecessary escaping.
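
A minimal sketch of that extraction in Python, just to illustrate the idea (it isn't the project's code). The function name extract_original_images is made up, and it assumes the quoted value is always a plain double-quoted string that also parses as JSON, which holds for the example above:

```python
import json
import re

# Matches: pixiv.context.originalImages[NN] = "...";
# Group 1 is the page index, group 2 is the quoted string literal.
ORIGINAL_IMAGE_RE = re.compile(
    r'pixiv\.context\.originalImages\[(\d+)\]\s*=\s*("(?:[^"\\]|\\.)*")\s*;'
)


def extract_original_images(html):
    """Return the original image URLs found in the page, ordered by page index.

    Hypothetical helper: assumes each literal is valid JSON, so json.loads
    can undo the "\\/" escaping.
    """
    pages = {}
    for index, literal in ORIGINAL_IMAGE_RE.findall(html):
        pages[int(index)] = json.loads(literal)
    return [pages[i] for i in sorted(pages)]
```

An empty result could also double as the "marker" for telling the two modes apart, assuming regular manga pages don't define pixiv.context.originalImages; that part is untested.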

@r888888888
Collaborator

Maybe it's better to just use the Pixiv web API to get this data.

@Type-kun
Collaborator Author

Perhaps. I'm not familiar with the web API. How does one access it?

@r888888888
Collaborator

There's a dedicated wrapper for it here: https://github.com/r888888888/danbooru/blob/master/app/logical/pixiv_api_client.rb

Using it is kinda iffy because it's not officially documented or supported. But it returns a lot of useful data without having to scrape any HTML.

@r888888888
Collaborator

deployed the change to testbooru

@Type-kun
Collaborator Author

Seems to work just fine for both old and new styles.
