Support for JS #2

Closed
shaneaevans opened this Issue Apr 1, 2014 · 19 comments

@shaneaevans
Member

shaneaevans commented Apr 1, 2014

Add support for JS-based sites. It would be nice to have UI support for configuring this instead of having to do it manually at the Scrapy level.

Perhaps we can allow users to enable or disable sending requests via Splash.

@duendex duendex added the feature label Apr 2, 2014

@kalessin

Member

kalessin commented Apr 2, 2014

This would imply adding a downloader middleware for slybot that routes URLs through Splash, right?

The UI would then be used to configure this middleware.
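As a rough sketch of the routing decision such a middleware would need to make, assuming it is driven by regex lists of URLs to pass through Splash and URLs to keep away from it. The function name, parameter names, and the "block wins over pass" rule are all hypothetical, not something slybot defines:

```python
import re


def should_use_splash(url, pass_patterns=(), block_patterns=()):
    """Return True if `url` should be rendered through Splash.

    Hypothetical rule: a URL goes to Splash when it matches at least one
    pass pattern and matches none of the block patterns.
    """
    # A block pattern always wins, so check those first.
    if any(re.search(pattern, url) for pattern in block_patterns):
        return False
    # Otherwise the URL must match an explicit pass pattern.
    return any(re.search(pattern, url) for pattern in pass_patterns)
```

A UI toggle would then only need to edit the two pattern lists (or leave the pass list empty to disable rendering entirely).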

@thoughtchad

thoughtchad commented Apr 11, 2014

Seems like there is already some support in here for JS. I see references to phantomjs, etc.

@whatsdis

whatsdis commented Nov 4, 2014

How do you enable JS in Portia?

@duendex

Contributor

duendex commented Nov 4, 2014

Portia does not support rendering JS yet.

@problemss

problemss commented Nov 10, 2014

+1 for JS

@maoouyang

maoouyang commented Mar 4, 2015

Another question: when I enter a long URL, it gets cut off, so Portia can't visit the page. Is this a bug?

@robertkeizer

robertkeizer commented Apr 1, 2015

So much yes in this.

I'll take a look through this to see how far off it is.

@drprabhakar

drprabhakar commented Apr 1, 2015

Is it possible to support JS in Portia by using Splash?

@agramesh

agramesh commented May 7, 2015

Hi, I use this configuration for JavaScript URLs:

DOWNLOADER_MIDDLEWARES = {
    'xxxxxx.middleware.splash.SplashMiddleware': 725,
}

SPLASH_ENABLED = True
SPLASH_ENDPOINT = 'http://localhost:8050/render.html'
SPLASH_WAIT = 2
SPLASH_IMAGES = False
SPLASH_URL_PASS = (r'www.xxxxxxx.com',)
#SPLASH_URL_BLOCK = (r'badexample.com',)

CONCURRENT_REQUESTS_PER_DOMAIN = 1
CONCURRENT_REQUESTS = 16
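
The settings above map fairly directly onto Splash's render.html HTTP endpoint, which takes url, wait, and images as query arguments. A minimal sketch of the request URL a middleware might build from them. Note that build_splash_url is a hypothetical helper, not part of Portia or of the middleware quoted here, and how that middleware actually builds its requests is not shown in this thread:

```python
from urllib.parse import urlencode


def build_splash_url(target_url,
                     endpoint="http://localhost:8050/render.html",
                     wait=2,
                     images=False):
    """Build a Splash render.html URL for `target_url`.

    The defaults mirror SPLASH_ENDPOINT, SPLASH_WAIT and SPLASH_IMAGES
    from the settings above; the helper itself is an illustration only.
    """
    params = {
        "url": target_url,              # page Splash should load and render
        "wait": wait,                   # seconds to wait after the page loads
        "images": 1 if images else 0,   # 0 tells Splash to skip images
    }
    return endpoint + "?" + urlencode(params)


print(build_splash_url("http://www.example.com/"))
```

Fetching the printed URL by hand (for example in a browser) is a quick way to check that Splash itself returns rendered HTML before involving Portia.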

  • I didn't receive a JavaScript-enable error; Portia keeps loading the URL without end, but the command prompt shows this error:
    --- <exception caught here> ---
    File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
      current.result = callback(current.result, *args, **kw)
    File "/home/test/portia-master/portia-master/slyd/slyd/bot.py", line 97, in fetch_callback
      request = response.meta['twisted_request']
    exceptions.KeyError: 'twisted_request'
  • Is there any way to solve this? Please advise.
@ruairif

Contributor

ruairif commented May 7, 2015

It's possible that the middleware is accidentally removing the twisted_request object which is used to return the results from the spider. If you find where it's doing that and make sure it doesn't get removed then it should work.
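
To illustrate the idea, here is a minimal sketch of rerouting a request while carrying its meta along. FakeRequest and reroute_keeping_meta are hypothetical stand-ins used only so the example runs on its own; they are not Scrapy or Portia classes. In Scrapy itself, Request.replace() is the usual way to change the URL while keeping the rest of the request, meta included.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Stand-in for a Scrapy-like request (illustration only)."""
    url: str
    meta: dict = field(default_factory=dict)


def reroute_keeping_meta(request, new_url):
    """Return a new request for `new_url` that keeps the original meta.

    Copying meta is what keeps keys such as 'twisted_request' available
    to the callback that later reads response.meta.
    """
    return FakeRequest(url=new_url, meta=dict(request.meta))


original = FakeRequest("http://www.example.com/",
                       meta={"twisted_request": "placeholder"})
rerouted = reroute_keeping_meta(
    original, "http://localhost:8050/render.html?url=http://www.example.com/")
assert "twisted_request" in rerouted.meta
```

If a middleware instead builds a brand-new request with an empty meta, the lookup in fetch_callback would fail with exactly the KeyError shown above.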

@agramesh

agramesh commented May 27, 2015

Hi, are there any updates in the new release of Portia for JavaScript URLs? I still get errors when configuring the setup above.

@agramesh

agramesh commented May 28, 2015

Hi, does anybody have a reference for a Portia-Splash middleware?

@ruairif

Contributor

ruairif commented May 28, 2015

What do you mean by reference?

@agramesh

agramesh commented May 28, 2015

I used the Scrapinghub Splash middleware, but it's not working for JavaScript in Portia. The configuration I mentioned above works perfectly, but I receive empty pages from the Portia UI, so I think the middleware is not supported. Can you suggest anything?

@dvdbng dvdbng self-assigned this May 28, 2015

@ruairif

Contributor

ruairif commented May 28, 2015

Did you try adding the middleware to slybot/slybot/settings.py?

@agramesh

agramesh commented May 28, 2015

I tried it in slyd/slyd/settings.py: I created a local_settings.py file there with the configuration above and imported it into settings.py. Is this correct? If not, please let me know how to configure it.

@ruairif

Contributor

ruairif commented May 28, 2015

Try putting it in:

/slybot/slybot/local_slybot_settings.py
@agramesh

agramesh commented May 28, 2015

OK, I will try it and update you. Thanks.

@agramesh

agramesh commented May 29, 2015

I tried your suggestion, "/slybot/slybot/local_slybot_settings.py", but I received the same error from the Portia UI. Could you please give me a reference URL for the middleware Python file?

@dvdbng dvdbng closed this Aug 28, 2015
