Don't make DB query when parsing an article. #27

Open
supersam654 opened this issue Feb 19, 2016 · 1 comment

@supersam654
Contributor

While parsing each article, we query the DB to find out which tags to remove for that article's source. This can cause deadlocks on some systems and warnings on others (I believe warnings on Debian and possible deadlocks on OS X). It would be better to fetch the source-specific cleaning rules from the DB up front and pass that data to the parser along with the article HTML.
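
A rough sketch of that restructuring, not this project's actual code: the parent process looks up the per-source rules once and hands each worker an `(html, tags_to_remove)` pair, so the parsing function never touches the database. The names `parse_article`, `build_jobs`, `parse_all`, and the article dict keys `source` and `html` are assumptions made for illustration, and the standard-library `HTMLParser` stands in for whatever parser the project uses.

```python
from html.parser import HTMLParser
from multiprocessing import Pool


class _TagStripper(HTMLParser):
    """Collects text, skipping everything inside the given tags.

    Assumes the removed tags are container elements with closing tags
    (e.g. <aside>, <script>), not void elements such as <img>.
    """

    def __init__(self, tags_to_remove):
        super().__init__()
        self._skip = set(tags_to_remove)
        self._depth = 0      # > 0 while inside a removed element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._skip:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self._skip and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the kept chunks and collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def parse_article(job):
    """Worker: parses one article from the data passed in (no DB access)."""
    html, tags_to_remove = job
    stripper = _TagStripper(tags_to_remove)
    stripper.feed(html)
    stripper.close()
    return stripper.text()


def build_jobs(articles, rules_by_source):
    """Parent side: pair each article's HTML with its source's rules.

    `rules_by_source` is a plain dict assumed to have been loaded from
    the DB once, before any worker process is started.
    """
    return [(a["html"], rules_by_source.get(a["source"], [])) for a in articles]


def parse_all(articles, rules_by_source, processes=2):
    """Parse all articles in worker processes that never open a DB client."""
    with Pool(processes) as pool:
        return pool.map(parse_article, build_jobs(articles, rules_by_source))
```

Because the workers receive only plain strings and lists, it no longer matters whether the platform forks or spawns them.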

A sample error is:

/home/bdc/anaconda2/lib/python2.7/site-packages/pymongo/topology.py:74: UserWarning: MongoClient opened before fork. Create MongoClient with connect=False, or create client after forking. See PyMongo's documentation for details: http://api.mongodb.org/python/current/faq.html#using-pymongo-with-multiprocessing>
  "MongoClient opened before fork. Create MongoClient "
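
The warning's other suggestion, creating the client after forking, could be sketched as a per-process lazy cache. This is a stopgap under stated assumptions rather than the fix proposed above: `get_client` and `make_client` are hypothetical names, and the factory is passed in so the sketch stays independent of any particular driver.

```python
import os

_client = None
_client_pid = None


def get_client(make_client):
    """Return a client created in the current process, building one if needed.

    `make_client` is a hypothetical zero-argument factory supplied by
    the caller.
    """
    global _client, _client_pid
    pid = os.getpid()
    if _client is None or _client_pid != pid:
        # Either the first call in this process, or a forked child that
        # inherited the parent's client: build a fresh one here.
        _client = make_client()
        _client_pid = pid
    return _client
```

With PyMongo, the factory might be something like `lambda: pymongo.MongoClient(connect=False)`, using the `connect=False` option that the warning text itself mentions.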

supersam654 self-assigned this Feb 26, 2016

@supersam654
Contributor Author

This is actually a breaking issue on Windows and is pretty severe. Hopefully someone remedies it soon.

supersam654 removed their assignment Mar 14, 2016