Add full data dump framework and implement for roundup #47

602p · 2014-04-15T21:11:37Z

This adds the framework to use --extended-scrape to dump a copy of the raw scraped data. It also implements this feature for the Roundup issue importer.

602p · 2014-04-16T00:08:23Z

Code has been merged and tests have been run (except for those dumb 2.6-dependent ones >:( ).

ehashman · 2014-04-16T00:16:08Z

bugimporters/base.py

        # Store the tracker model
        self.tm = tracker_model
        # Store the reactor manager
        self.rm = reactor_manager
+        # Store wether or not to scrape messages, keywords, etc


Typo: "wether" -> whether

ehashman · 2014-04-16T00:58:26Z

bugimporters/main.py

+            try:
+                self.extended_scrape==False
+            except AttributeError:
+                self.extended_scrape=False


Suggestion: Assuming I understand the reason for this code---I think this pull request should include the addition of extended_scrape to all the importers, rather than catching this exception here.

Actually, this is here to make sure the attribute exists, because it sometimes disappears when called from tests (through no lack of debugging of my own >:? ).

I agree with @ehashman's remark -- good spotting.

Also, PEP8 suggests using more spaces around these "=" characters. See http://legacy.python.org/dev/peps/pep-0008/ for more information on that.
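
For illustration, one way to act on both remarks is to give every importer a default for the attribute up front, so that no try/except is needed, and to space the assignment as PEP 8 recommends. This is only a minimal sketch with hypothetical class names (`BaseImporter`, `RoundupLikeImporter`); it is not the code that was merged:

```python
class BaseImporter(object):
    # Hypothetical stand-in for an importer base class. A class-level
    # default means the attribute always exists, even if a test builds
    # the object without going through the usual constructor path.
    extended_scrape = False

    def __init__(self, extended_scrape=False):
        # PEP 8: single spaces around "=" in an assignment statement
        # (but none around "=" in a keyword-argument default).
        self.extended_scrape = extended_scrape


class RoundupLikeImporter(BaseImporter):
    # Hypothetical subclass: inherits the default without extra code.
    pass


importer = RoundupLikeImporter()
bare = BaseImporter.__new__(BaseImporter)  # skips __init__, as a test might
```

With a default in place, reading `self.extended_scrape` can no longer raise `AttributeError`, so the guard in `main.py` becomes unnecessary.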

paulproteus · 2014-04-16T13:56:52Z

bugimporters/main.py

@@ -130,13 +139,17 @@ def start_requests(self):
                logging.error("FYI, this bug importer does not support "
                              "process_bugs(). Fix it.")

-    def __init__(self, input_filename=None):
+    def __init__(self, input_filename=None, extended_scrape="False"):


You're using a string, "False", here, but can you use a boolean value, like False, instead? That would be much tidier, in my opinion.

This comment might not be super clear, so please feel free to ask me to clarify.

This is required because the values passed through scrapy.cmdline.args get run through str(), so for the sake of keeping the code consistent I made it a string.

(However this has been cleaned up!)
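
As a sketch of the kind of cleanup discussed in this thread: Scrapy hands spider arguments given with `-a` to the spider as strings, so one option is to convert the string to a real boolean once, at the boundary, and use booleans everywhere else. The helper name `parse_bool_arg`, the class `ExampleSpider`, and the accepted spellings are assumptions made for illustration, not what this pull request merged:

```python
def parse_bool_arg(value, default=False):
    """Convert a command-line style value to a bool (hypothetical helper)."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    # Note that bool("False") is True, so compare the text explicitly.
    return str(value).strip().lower() in ("1", "true", "yes", "on")


class ExampleSpider(object):
    # Hypothetical stand-in for the __init__ shown in the diff above.
    def __init__(self, input_filename=None, extended_scrape=False):
        self.input_filename = input_filename
        # Normalize once here so the rest of the code sees a real bool.
        self.extended_scrape = parse_bool_arg(extended_scrape)
```

This keeps the default a plain `False` while still accepting the stringified value that arrives from the command line.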

paulproteus · 2014-04-16T16:01:23Z

+1, merging now

@602p did you send an email with licensing info to the mailing list? If not, do that too.

602p added 17 commits April 15, 2014 13:07

Added file loading for roundup

9aa2af6

Add file scraping test

3530bc4

Added message scraping functionality (to roundup)

c60b826

Add --extended-scrape option to scrape messages, keywords, etc..

c13f8bb

Add raw_data dumping option to bug parser (for use with converse)

085bb7b

Add additional YAML example

481ea3b

Fixed, added tests

bae7c2c

Remove a debug command

e3fbb89

Added file loading for roundup

710fb0d

Add file scraping test

92432df

Added message scraping functionality (to roundup)

69fbc7b

Add --extended-scrape option to scrape messages, keywords, etc..

d87877f

Add raw_data dumping option to bug parser (for use with converse)

59906d1

Add additional YAML example

0167fc1

Fixed, added tests

1b266c1

Remove a debug command

ad8081f

Merge roundup.py

41def37

ehashman reviewed Apr 16, 2014

602p added 5 commits April 15, 2014 19:40

Move rawdata test JSON blob to its own file

689ed2b

Made JSON loading cleaner

ffa932e

Cleaned up JSON blob

bfb71be

Added a note about raw_data testing

cbee1f3

Im bad at spelling

6f2c164

ehashman reviewed Apr 16, 2014

Added Asheesh's frackalackadingdong comma

be36d67

paulproteus reviewed Apr 16, 2014

602p added 2 commits April 16, 2014 09:35

Removed an offending try-catch and added some whitespace

53999bf

Cleaned up extended_scrape passing

a62fb01

602p changed the title ~~Add full data dump framework and implemen for roundup~~ Add full data dump framework and implement for roundup Apr 16, 2014

602p added 7 commits April 16, 2014 10:17

PEP-8 ifiying

03ae989

PEP-8 ifiying further

0b228ae

PEP-8 ifiying even further

fe81194

PEP-8 ifiying EVEN further

2e06ceb

PEP-8 ifiying comments

f1b2cba

PEP-8 changes (AKA Guido is very opinionated)

271e569

PEP-8

bd2df90

paulproteus added a commit that referenced this pull request Apr 16, 2014

Merge pull request #47 from 602p/master

bce528a

Add full data dump framework and implement for roundup

paulproteus merged commit bce528a into openhatch:master Apr 16, 2014
