How to use multiple browser instances to crawl pages simultaneously? #11

Closed
yi719 opened this Issue Sep 26, 2012 · 4 comments

2 participants

@yi719
yi719 commented Sep 26, 2012

No description provided.

@kiorky
Makina Corpus member
kiorky commented Sep 28, 2012

Please be more precise on what you'd want to achieve.

@yi719
yi719 commented Oct 6, 2012

I want to use multiprocessing or threading to crawl pages simultaneously, as in the code shown below.

But I got this message: 'WARNING: QApplication was not created in the main() thread'

from multiprocessing.pool import ThreadPool

import spynner


url = 'http://google.com/'

def main():
    pool = ThreadPool(10)
    [pool.apply_async(crawl, (url,)) for i in range(1, 100)]



def crawl(url):
    browser = spynner.Browser(debug_level=spynner.INFO)
    browser.create_webview()
    browser.load(url)


if __name__ == '__main__':
    main()

@kiorky
Makina Corpus member
kiorky commented Oct 7, 2012

I think it's more related to PyQt and some incompatibilities with multiprocessing.

See: http://doc.qt.digia.com/4.2/threads.html

I will investigate more.

@kiorky
Makina Corpus member
kiorky commented Oct 7, 2012

I haven't finished investigating, but I'm not sure you can do multiprocessing like that.
As far as I can tell, you will have to spawn another process with a regular exec, since you can't combine multithreading and Qt without hacks in C++, and it seems to be impossible from PyQt.
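
For illustration only, here is a rough sketch of the "spawn another process" idea, not something tested against spynner. It assumes each crawl runs in its own child Python process, so every browser is created in that process's main thread. The names WORKER_SOURCE, run_worker, crawl_all and max_procs are made up for this example; the spynner calls inside the worker are just the ones from the snippet above. The threads in the parent only wait on child processes and never touch Qt.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical worker program: one browser per child process, created in that
# process's main thread. The spynner calls are copied from the snippet above.
WORKER_SOURCE = """
import sys
import spynner

browser = spynner.Browser(debug_level=spynner.INFO)
browser.create_webview()
browser.load(sys.argv[1])
"""


def run_worker(url, worker_cmd):
    """Run one child process for one URL; return (url, exit code, stdout)."""
    proc = subprocess.run(
        worker_cmd + [url],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return url, proc.returncode, proc.stdout


def crawl_all(urls, worker_cmd=None, max_procs=4):
    """Crawl each URL in a separate process, at most max_procs at a time."""
    if worker_cmd is None:
        # Assumption: the same interpreter that runs this script has spynner.
        worker_cmd = [sys.executable, "-c", WORKER_SOURCE]
    # Parent-side threads only block on subprocesses; no Qt objects live here.
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(lambda u: run_worker(u, worker_cmd), urls))
```

Calling crawl_all(['http://google.com/'] * 10) would start ten child interpreters, four at a time. A non-zero exit code in a result means that child failed, and its error output could be inspected by also returning proc.stderr.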

@kiorky kiorky closed this Oct 21, 2012