Two Issues with Manual selection #62

Closed
DPalmz opened this issue Nov 7, 2020 · 13 comments

@DPalmz

DPalmz commented Nov 7, 2020

I have been unable to download using chapter-to-chapter selection, even after checking against one of the previous issues here that I was doing it correctly. I get this error:
[ERROR]Cannot invoke "org.jsoup.nodes.Element.absUrl(String)" because the return value of "org.jsoup.select.Elements.first()" is null

The second issue is less of a concern but slightly annoying: after trying chapter-to-chapter, even if I delete the selection, I am unable to do a manual table of contents download. I get the same error, as if the program still thinks I am doing chapter-to-chapter.

@Flameish
Owner

Flameish commented Nov 8, 2020

I'm currently redoing parts of NG, so I can't really fix this for you right now. Hopefully the new release will have these issues fixed already.

EDIT: Can you link the novel you're trying to download? I'll take a look at it, as everything is working as intended when I try on 3.1.1.

@DPalmz
Author

DPalmz commented Nov 10, 2020

Thank you for replying. It makes sense that you wouldn't bugfix if you're working on a new version. I hope the new one works for me too.

I've tried it with http://www.talesofmu.com/ and https://web.archive.org/web/20200216171325/http://www.addergoole.com/9/
(though I think I only managed to learn how to successfully do it with the first one)

Also, let me say I'm really glad your program works with web-archived stories, because, my god, sometimes I go to read some of these only to find they now exist only there.

@Flameish
Owner

Tales Of Mu has EPUBs available, please think about supporting the author and getting their version! :)

Never thought about using the Wayback Machine myself, but it's great that it works out of the box.
I took a look at Addergoole and these options seem to work just fine (only for the first 40 chapters or so; not everything got archived after that point):
First chapter: https://web.archive.org/web/20160510181758/http://www.addergoole.com/9/2012/09/chapter-1-wylie
Last chapter: (I'm not sure when exactly it stopped working)
Button: a[rel=next]
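
For readers wondering how these three inputs fit together, here is a minimal sketch of a chapter-to-chapter walk. It is not Novel-Grabber's actual code: the class `ChapterWalk`, the method `walk`, the `maxChapters` cap, and the `nextLink` function (a stand-in for "load the page and read the href behind the button selector") are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

public class ChapterWalk {

    /**
     * Follows "next" links starting at firstUrl and returns the chapter URLs in order.
     * Stops when the last chapter is reached, when no next link is found, when a
     * URL repeats (a loop), or when maxChapters pages have been collected.
     * lastUrl may be null if the last chapter is unknown.
     */
    public static List<String> walk(String firstUrl, String lastUrl,
                                    Function<String, Optional<String>> nextLink,
                                    int maxChapters) {
        List<String> chapters = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = firstUrl;
        // seen.add returns false for a URL that was already visited
        while (current != null && chapters.size() < maxChapters && seen.add(current)) {
            chapters.add(current);
            if (current.equals(lastUrl)) {
                break; // reached the user-supplied last chapter
            }
            // An empty Optional means the button selector matched nothing
            current = nextLink.apply(current).orElse(null);
        }
        return chapters;
    }

    public static void main(String[] args) {
        // Hypothetical three-page site whose last page links back to itself
        Map<String, String> links = Map.of("a", "b", "b", "c", "c", "c");
        Function<String, Optional<String>> next =
                url -> Optional.ofNullable(links.get(url));
        System.out.println(walk("a", null, next, 100)); // prints [a, b, c]
    }
}
```

Leaving the last chapter unset (`null` here) makes the walk run until the next button is missing or starts repeating, which is roughly the situation with the partially archived Addergoole chapters.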

@Flameish
Owner

Is your problem still persistent in the new version?

@DPalmz
Author

DPalmz commented Nov 19, 2020

Yes, my issue is fixed, thanks. The new version is a lot smoother too.

@DPalmz
Author

DPalmz commented Nov 19, 2020

How would I go about manually selecting the chapter container though? I've run across some sites where the autodetect isn't grabbing the chapter container.

@Flameish
Owner

Flameish commented Nov 19, 2020

The chapter container selection uses CSS selectors to pick the correct HTML element. It is pretty easy to understand even without any prior HTML knowledge:

Just right-click inside the chapter body and select Inspect Element. If right-clicking is disabled on the website, open the inspector tool manually (Firefox: F12, not sure about Chrome) or via the menu -> dev tools -> inspector.

Screenshot from 2020-11-19 11-16-44

Next you have to find the container which contains the chapter text. It's probably the longest one and/or has many <p> (paragraph) element tags.

Screenshot from 2020-11-19 11-18-15
It already displays the CSS selector at the top: div.chapter-inner.chapter-content

Or you could copy it directly if you right click on the container and select Copy -> CSS selector.

Screenshot from 2020-11-19 11-19-35

Don't forget that you can test your selection via the Preview Chapter function based on your input.

Screenshot from 2020-11-19 11-42-15

If you need even more specific control over your selection you can take a look at the Jsoup selector syntax page. Jsoup also has a live testing page which is pretty useful to find the correct selector.

You can also use CSS selectors to remove content from the chapter via the Edit Blacklist tags window.

Screenshot from 2020-11-19 12-06-17
All <p> elements and elements which have the ads class will be removed.

Blacklisted tags will also be reflected in the preview window, as you can (not) see.

Screenshot from 2020-11-19 12-07-00
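
To make the two steps above concrete (pick the chapter container, then strip blacklisted elements), here is a small jsoup sketch. It assumes jsoup is on the classpath; the class `ContainerDemo`, the method `chapterText`, and the sample HTML are made up for illustration and are not taken from Novel-Grabber's source.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ContainerDemo {

    /**
     * Returns the text of the first element matching containerSelector after
     * removing everything inside it that matches blacklistSelector.
     * Returns null when the container selector matches nothing.
     */
    public static String chapterText(String html, String containerSelector,
                                     String blacklistSelector) {
        Document doc = Jsoup.parse(html);
        Element container = doc.selectFirst(containerSelector);
        if (container == null) {
            return null; // wrong selector: nothing to extract
        }
        if (!blacklistSelector.isEmpty()) {
            container.select(blacklistSelector).remove(); // drop blacklisted elements
        }
        return container.text();
    }

    public static void main(String[] args) {
        // Hypothetical page fragment using the selector from the screenshot
        String html = "<div class=\"chapter-inner chapter-content\">"
                + "<p>First paragraph.</p>"
                + "<div class=\"ads\">Buy now</div>"
                + "<p>Second paragraph.</p>"
                + "</div>";
        System.out.println(chapterText(html, "div.chapter-inner.chapter-content", ".ads"));
    }
}
```

A blacklist entry such as `p, .ads` would behave like the screenshot above and remove the paragraphs as well, which is why the preview is worth checking after every change.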

@DPalmz
Author

DPalmz commented Nov 19, 2020

So I'm getting the error (Cannot invoke "org.jsoup.nodes.Element.absUrl(String)" because the return value of "org.jsoup.select.Elements.first()" is null).
It is picking up the text, but with chapter-to-chapter selection, when it hits this error the program for some reason keeps cycling to the next chapter, even if that chapter doesn't exist and is past the last chapter given.
I tested this with https://www.eviscerati.org/fiction/arbsl/2013/10/rake-starlight-chapter-01/ and https://caelum-lex.com/.
I got the same error with both, but did manage to get an EPUB out of Caelum Lex since it has a table of contents.

@Flameish
Owner

Both worked perfectly fine for me with .nav-next a as the next button. I didn't pay attention on Caelum Lex and got stuck in a loop, though, but Rake (it also has a table of contents from what I've seen) finished just fine.

That message sounds like it couldn't get the correct href (or any). Can you post what you've entered?
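
For context on why a non-matching button selector produces exactly that message, here is a hedged jsoup sketch. It assumes jsoup is on the classpath; the class `NextLinkDemo`, the method `nextChapterUrl`, the sample HTML, and the example.com URLs are invented for illustration and do not come from Novel-Grabber.

```java
import java.util.Optional;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class NextLinkDemo {

    /** Returns the absolute URL behind the next button, or empty if there is none. */
    public static Optional<String> nextChapterUrl(String html, String baseUri,
                                                  String buttonSelector) {
        Document doc = Jsoup.parse(html, baseUri);
        // select(...).first() returns null when nothing matches. Calling
        // absUrl("href") directly on that result is what raises the
        // "return value of Elements.first() is null" error.
        Element button = doc.select(buttonSelector).first();
        if (button == null) {
            return Optional.empty();
        }
        // absUrl returns "" when the href is missing or cannot be resolved
        String href = button.absUrl("href");
        return href.isEmpty() ? Optional.empty() : Optional.of(href);
    }

    public static void main(String[] args) {
        // Hypothetical chapter page with a WordPress-style next link
        String html = "<article id=\"post-2\"><div class=\"nav-next\">"
                + "<a href=\"/chapter-03/\">Next</a></div></article>";
        String base = "https://example.com/chapter-02/";
        System.out.println(nextChapterUrl(html, base, ".nav-next a"));
        // A selector tied to another page's post id matches nothing here
        System.out.println(nextChapterUrl(html, base, "#post-1 a"));
    }
}
```

The second call illustrates one way a selector can work on the first chapter and then fail: if it contains something specific to a single page (such as a post id), it may match nothing on the following pages.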

@DPalmz
Author

DPalmz commented Nov 20, 2020

So I've been using Opera for this. Maybe a different browser would be better, as I never got anything like .nav-next a out of the CSS selector.
I tried a few different combinations of things, both with auto chapter container select and with manual. For Rake by Starlight (yeah, it does look like it has a table of contents, though I didn't know if it would be detected so I didn't try) I've tried
#post-6593 > footer > nav > div > div > a
and
body.post-template-default.single.single-post.postid-6593.single-format-standard.custom-background.wp-custom-logo.wp-embed-responsive.author-hidden:nth-child(2) div.hfeed.site:nth-child(1) div.site-content.container.clearfix section.content-area main.site-main article.post-6593.post.type-post.status-publish.format-standard.has-post-thumbnail.hentry.category-arbsl footer.entry-footer nav.navigation.post-navigation div.nav-links div.nav-next > a:nth-child(1)
This long one is what came out of a CSS add-on I got.
I did also try Rake by Starlight with .nav-next a and was able to download it. So yes, it does seem to be a problem with my inputs.

@Flameish
Owner

Flameish commented Nov 21, 2020

Never tried with Opera, so it's good to see that it works there! That selector looks horrible lol; they should never be that long. It looks like a unique one too. You might want to take a look at the different sources for examples. Search for select inside the files.

@DPalmz
Author

DPalmz commented Nov 22, 2020

Thank you for your help

@asrind11

asrind11 commented Feb 1, 2023

A request to the author of Novel-Grabber: please add support for the site https://ranobe-novels.ru. None of the bots I found on the Internet and was able to run can download files from ranobe-novels.ru.
