Access to plain text of a page? #12
Comments
@JorjMcKie, yes there should be. I'll add the API for extracting image/text from a page.
Hi @rk700,
I am afraid I might have destroyed your change concerning text extraction. I will try it out ...
@JorjMcKie, thank you for your work on the project, especially while I could not devote much time to it. As for closing doc, I think we can open a new issue. Currently , And don't worry about the text extraction code; it can be recovered easily from the commit history :)
@rk700, I have repaired extractText, so it works again.
@JorjMcKie, the global context lives as long as the module, so it is freed when the program exits. And we can use the general function
Text extraction tested - closing the issue.
Is there a way to access the text contained in a page and e.g. analyze it inside Python?
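For anyone landing here with the same question, here is a minimal sketch of what "extract the text and analyze it in Python" can look like. It assumes a recent PyMuPDF release where the page method is called `get_text` (this thread refers to it as extractText, so the name depends on your version). The helper names `extract_page_text` and `word_frequencies` and the file name in the usage note are made up for illustration, not part of the library.

```python
import re
from collections import Counter


def extract_page_text(pdf_path, page_number=0):
    """Return the plain text of one page (hypothetical helper).

    Assumes a recent PyMuPDF where Page.get_text("text") exists;
    older releases exposed text extraction under a different name.
    """
    import fitz  # PyMuPDF, imported lazily so the analysis part works without it

    with fitz.open(pdf_path) as doc:
        page = doc[page_number]  # pages are 0-based
        return page.get_text("text")


def word_frequencies(text):
    """Count lower-cased words in a string (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)
```

Usage would be along the lines of `word_frequencies(extract_page_text("some.pdf", 0)).most_common(10)`, where `some.pdf` is a placeholder for your own file. Once the text is a plain Python string, any of the usual string or regex tooling applies.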