Tried to stop a LoopingCall that was not running #2011

Closed
ramwin opened this issue May 26, 2016 · 12 comments

ramwin commented May 26, 2016

I'm using MySQL to store my spider data, but when I set the pipeline to store data into a local MySQL server, it raises an error.
pipelines.py

import json
import requests
from mysql.connector import connection

MYSQL_SERVER = '192.168.1.90'   # with this value, the spider runs
MYSQL_SERVER = 'localhost'      # with this value, the spider raises the error
MYSQL_DB = 'scrapy'
MYSQL_USER = 'crawler'
MYSQL_PASS = 'crawl'
MYSQL_TABLE = 'pm25in'

class Pm25InPipeline(object):
    def __init__(self):
        pass

    def process_item(self, item, spider):
        # Build the INSERT statement by string formatting and run it on the
        # cursor opened in open_spider().
        command = '''insert into {table} (monitortime, monitorcity, monitorpoint,
            AQIindex, airsituation, primarypullutant, PM25content, PM10content,
            CO, NO2, O3_1h, O3_8h, SO2)
            values ( "{monitortime}", "{monitorcity}","{monitorpoint}", "{AQIindex}",
            "{airsituation}", "{primarypollutant}", {PM25content}, {PM10content},
            {CO}, {NO2}, {O3_1h}, {O3_8h}, {SO2} );
            '''.format(table=MYSQL_TABLE, **dict(item))
        self.cursor.execute(command)
        return item

    def open_spider(self, spider):
        # Open one MySQL connection per spider run.
        self.cnx = connection.MySQLConnection(
            host=MYSQL_SERVER,
            user=MYSQL_USER,
            password=MYSQL_PASS,
            database=MYSQL_DB,
            charset='utf8')
        self.cursor = self.cnx.cursor()

    def close_spider(self, spider):
        # Commit everything and close the connection when the spider finishes.
        self.cnx.commit()
        self.cnx.close()

The traceback:

wangx@wangx-PC:~/github/pm25in$ scrapy crawl pm25spider
Unhandled error in Deferred:


Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 163, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 167, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 87, in crawl
    yield self.engine.close()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 100, in close
    return self._close_all_spiders()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 340, in _close_all_spiders
    dfds = [self.close_spider(s, reason='shutdown') for s in self.open_spiders]
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 298, in close_spider
    dfd = slot.close()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 44, in close
    self._maybe_fire_closing()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 51, in _maybe_fire_closing
    self.heartbeat.stop()
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/task.py", line 202, in stop
    assert self.running, ("Tried to stop a LoopingCall that was "
exceptions.AssertionError: Tried to stop a LoopingCall that was not running.
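
The assertion appears to be only the symptom: an exception during spider startup (here, the failed MySQL connection in open_spider) triggers an engine shutdown before the heartbeat task has started. A minimal sketch, not the reporter's actual code, that reuses the MYSQL_* constants above and logs and re-raises the connection error so the real cause shows up in the log:

from mysql.connector import connection, Error

class Pm25InPipeline(object):

    def open_spider(self, spider):
        # Sketch only: MYSQL_* constants as defined earlier in pipelines.py.
        try:
            self.cnx = connection.MySQLConnection(
                host=MYSQL_SERVER,
                user=MYSQL_USER,
                password=MYSQL_PASS,
                database=MYSQL_DB,
                charset='utf8')
        except Error as exc:
            # Make the real cause visible instead of only the LoopingCall assertion.
            spider.logger.error("Could not connect to MySQL at %s: %s",
                                MYSQL_SERVER, exc)
            raise
        self.cursor = self.cnx.cursor()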
J-Hong commented Jun 1, 2016

Hey, I had the same error log as yours, and my problem was that connection.MySQLConnection() was not connecting successfully.

Once I could connect to my MySQL server, the Scrapy crawl worked!

Hope this helps.
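
As a quick sanity check of exactly that, here is a small standalone script (run outside Scrapy) using the same mysql.connector call as the pipeline above; the connection parameters are placeholders taken from the report and should be replaced with your own:

from mysql.connector import connection, Error

try:
    cnx = connection.MySQLConnection(
        host='localhost',      # the value that fails in the report above
        user='crawler',
        password='crawl',
        database='scrapy',
        charset='utf8')
except Error as exc:
    print('MySQL connection failed: %s' % exc)
else:
    print('MySQL connection OK')
    cnx.close()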

leearic commented Jun 1, 2016

Why don't you just do the MySQL connect, execute, and close all at once inside process_item?

J-Hong commented Jun 1, 2016

If process_item needs to use MySQL for every item, it should be better to open and close the connection only once, in open_spider and close_spider.
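
To make the two options concrete, here is a trimmed-down sketch of the per-item variant suggested above (connect, execute, close inside process_item). It reuses the MYSQL_* constants from the pipeline earlier in the thread, the pipeline class name is hypothetical, and only two columns are inserted for brevity. It reconnects for every item, which is why a single connection opened in open_spider and closed in close_spider is usually preferred:

from mysql.connector import connection

class PerItemMySQLPipeline(object):
    """Hypothetical pipeline: one connection per item."""

    def process_item(self, item, spider):
        cnx = connection.MySQLConnection(
            host=MYSQL_SERVER, user=MYSQL_USER,
            password=MYSQL_PASS, database=MYSQL_DB, charset='utf8')
        try:
            cursor = cnx.cursor()
            # Parameterized query; only two columns shown for brevity.
            cursor.execute(
                "insert into {table} (monitortime, monitorcity) "
                "values (%s, %s)".format(table=MYSQL_TABLE),
                (item['monitortime'], item['monitorcity']))
            cnx.commit()
        finally:
            cnx.close()
        return item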

leearic commented Jun 1, 2016

I always use Django... I feel Scrapy + Django work very well together.

shartoo commented Jun 3, 2016

I got the same error and found it occurred because the settings file configuration was not right. You may want to check yours.

widnyana commented Jun 7, 2016

@shartoo can you give some hints about the settings configuration?

shartoo commented Jun 7, 2016

Like this:

ITEM_PIPELINES = {
    'bdtb.pipelines.bdtb_pipeline': 300,
    'bdtb.item2mysql_pipeline.MySQLStorePipeline': 300,
}

You should check whether the paths are right.
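
A quick way to verify that each dotted path in ITEM_PIPELINES actually resolves is to load it with Scrapy's load_object helper; a small sketch using the example paths above (substitute your own):

from scrapy.utils.misc import load_object

for path in ('bdtb.pipelines.bdtb_pipeline',
             'bdtb.item2mysql_pipeline.MySQLStorePipeline'):
    # Raises ImportError (bad module) or NameError (bad object name) if wrong.
    load_object(path)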

desertjinn commented Jul 1, 2016

I have a working Scrapy project with two named spiders that run successfully with Scrapy v1.0.1 and Python v2.7.9.
One of the two spiders uses the MySQL-python library to connect to a MySQL database.

Today, I updated Python to 2.7.11+ as well as Scrapy to 1.1.0 and received a similar error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 163, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 167, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/lib/python2.7/dist-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 87, in crawl
    yield self.engine.close()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 100, in close
    return self._close_all_spiders()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 340, in _close_all_spiders
    dfds = [self.close_spider(s, reason='shutdown') for s in self.open_spiders]
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 298, in close_spider
    dfd = slot.close()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 44, in close
    self._maybe_fire_closing()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 51, in _maybe_fire_closing
    self.heartbeat.stop()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/task.py", line 202, in stop
    assert self.running, ("Tried to stop a LoopingCall that was "
exceptions.AssertionError: Tried to stop a LoopingCall that was not running.

I have also rectified my settings.py file as recommended by @shartoo with no luck:

ITEM_PIPELINES = {
    'tttscraper.pipelines.MySQLStorePipeline': 300,
    'tttscraper.pipelines.CustomFilePipeline': 300,
}

For now, I've reverted Scrapy to version 1.0.1 and the code works again. (Python is still at v2.7.11+)

frkhit commented Jul 5, 2016

I ran into this error when my pipeline.py had a bug.

widnyana commented Sep 13, 2016

Updating my custom scheduler fixed this issue.

In my case, pqclass was missing from the from_crawler classmethod. For further info, check and compare your custom scheduler with Scrapy's original scheduler.

I got the hint after trying scrapy shell <URL GOES HERE> and finding an error trace leading to my buggy custom scheduler.
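
For anyone with a custom scheduler: the safer pattern is to subclass Scrapy's own scheduler and inherit its from_crawler, so that settings-driven attributes such as pqclass (loaded from the SCHEDULER_PRIORITY_QUEUE setting) are not dropped. A minimal sketch, assuming Scrapy 1.x; the overridden method is only a placeholder for whatever custom behaviour is needed:

from scrapy.core.scheduler import Scheduler

class CustomScheduler(Scheduler):
    # from_crawler is inherited, so the dupefilter and the queue classes
    # (pqclass, dqclass, mqclass) are still built from the project settings.

    def enqueue_request(self, request):
        # Placeholder for custom bookkeeping.
        return super(CustomScheduler, self).enqueue_request(request)

It can then be enabled by pointing the SCHEDULER setting at the class path.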

ohmycloud commented Oct 8, 2016

AssertionError: Tried to stop a LoopingCall that was not running.

guneysus commented Oct 23, 2016

I was having this issue even though it had worked the day before with pymongo.

I was able to fix it by reinstalling pymongo with pip install pymongo --upgrade.

No version was bumped, but the problem was fixed.

redapple added a commit to redapple/scrapy that referenced this issue on Nov 7, 2016:
    Also add a test on state of looping task in LogStats extension
    Fixes scrapy#2011 and scrapy#2362

redapple self-assigned this on Nov 7, 2016
redapple added the in progress label on Nov 7, 2016
kmike closed this in #2382 on Nov 8, 2016
redapple removed the in progress label on Nov 8, 2016