
Gb/misc features #72
Merged: 10 commits, Nov 27, 2019

Conversation

grantbuster (Member) commented Nov 26, 2019

Addition of four features:

  1. Generation can now pass through the main resource data arrays (dni, dhi, windspeed) (issue #71: Generation to pass through resource data).
  2. The collection method that purges the chunked node files now checks that all source datasets are present in the final output file before deleting the chunked files. If any datasets are missing (not collected), a warning is logged and the chunks are not deleted (issue #70: Collection purge chunks to protect against incomplete collection).
  3. Representative profiles can now take a profile weighting argument, input as the "gid_counts", so that rep profiles are weighted by their exclusion fraction (issue #66: Representative Profiles should consider weighting); see the sketch below.
  4. A low-memory collection method was added to the collection handler. It collects one dataset from one file chunk at a time and writes directly from the source file to the final output file. Travis tested this and it works.

Pytests were added for all of these except item 2, which I tested manually on Eagle.
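
To make item 3 concrete, here is a minimal sketch of how gid_counts weighting could enter representative-profile selection (illustrative only; the function name and signature below are hypothetical, not reV's actual API): each site's profile contributes to the mean in proportion to its included-pixel count, and the site closest to that weighted mean is returned as the rep profile.

```python
# Hypothetical sketch of exclusion-weighted rep-profile selection; not reV's
# actual implementation. profiles has shape (time, n_sites) and gid_counts
# holds each site's included-pixel count from the exclusion layer.
import numpy as np


def pick_rep_profile(profiles, gid_counts):
    """Return the site profile closest (RMSE) to the weighted mean profile."""
    weights = np.asarray(gid_counts, dtype=float)
    weights /= weights.sum()

    # Weighted mean profile across sites.
    mean_profile = (profiles * weights).sum(axis=1)

    # RMSE of each site's profile against the weighted mean.
    errors = np.sqrt(((profiles - mean_profile[:, None]) ** 2).mean(axis=0))
    i_rep = int(np.argmin(errors))

    return profiles[:, i_rep], i_rep
```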

MRossol (Collaborator) left a comment:

  1. Create a log_memory function in loggers.py to replace the memory logging scattered throughout reV:
    log_memory(logger, level='DEBUG')
  2. Update DatasetCollector to no longer use SmartParallelJob and instead transfer file by file, dataset by dataset, with logic that checks whether the entire file/dataset fits in memory; if not, transfer as many chunks as will fit in memory at a time until the file/dataset has been transferred. (See the sketch after this list.)
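
A minimal sketch of these two requests, assuming psutil for system memory inspection; the names, signatures, and defaults here are hypothetical and illustrative rather than reV's actual implementation.

```python
# Hypothetical sketches; the actual reV helpers may differ.
import logging

import psutil


def log_memory(logger, level='DEBUG'):
    """Log current system memory utilization through the given logger."""
    mem = psutil.virtual_memory()
    msg = ('Memory utilization is {:.2f} GB out of {:.2f} GB total '
           '({:.1f}% used)'.format(mem.used / 1e9, mem.total / 1e9,
                                   mem.percent))
    logger.log(getattr(logging, level.upper()), msg)


def sites_per_pass(site_mem_req, mem_util_lim=0.7):
    """Number of sites that fit in memory for one chunked transfer pass."""
    avail = psutil.virtual_memory().available * mem_util_lim
    return max(1, int(avail // site_mem_req))
```

DatasetCollector could then loop over site slices of at most sites_per_pass(...) sites whenever a full dataset does not fit in memory.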

grantbuster (Member, Author) commented:

@MRossol, added those two features. Cleaned up collection a lot. Added a low-memory test, although I realize now that it will be somewhat hardware dependent. Still, it works on my machine!

MRossol (Collaborator) left a comment:

The architecture looks great, much cleaner.
A few questions and comments on the method names:

"""
Add results from SmartParallelJob to out
@staticmethod
def _get_site_mem_req(shape, dtype, n=100):
MRossol (Collaborator):

I'm confused: is this the memory for one site, one chunk, or the whole dataset?

grantbuster (Member, Author):

Just for one site. I realize I forgot to update some of the docstrings. Fixed.
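
For context, a rough sketch of how a per-site memory requirement could be estimated, assuming a 2D (time, sites) dataset; this is illustrative only and the actual _get_site_mem_req body may differ: allocate n sample sites with the dataset's time length and dtype, then average the footprint over the sample.

```python
# Illustrative estimate only; the real _get_site_mem_req may differ.
import sys

import numpy as np


def _get_site_mem_req(shape, dtype, n=100):
    """Estimate the memory (bytes) required to collect a single site."""
    # Allocate n sample sites of the dataset's time length and dtype,
    # then average the in-memory footprint over the sample.
    sample = np.ones((shape[0], n), dtype=dtype)
    return sys.getsizeof(sample) / n
```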

.format(os.path.basename(fp_source), e))
raise e

def _low_mem_collect(self):
MRossol (Collaborator):

If this is the standard collection method, I think we should rename it to just _collect.

grantbuster (Member, Author):

Agreed!

MRossol (Collaborator) commented Nov 27, 2019

Rebase and merge away
