Kartothek Integration #137

Merged
merged 14 commits into from
Mar 16, 2020

Conversation

brendancol
Contributor

No description provided.

…oned parquet file and provides same API as original Dataset class
brendancol requested a review from dharhas March 15, 2020 03:42
Contributor

timothydmorton left a comment

This looks great! Is this ready for me to check out on lsst-dev?

if self.tracts is None:
    self.tracts = list(set(x for v in self.metadata['visits'].values() for x in v.keys()))

def parse_metadata_from_bulter(self):
Contributor

Typo here (bulter -> butler)

except:
    raise
    print(f'{self.path} is not available in Butler attempting to read parquet files instead')
    self.parse_metadata_from_bulter()
Contributor

same typo

Comment on lines +64 to +65
except:
    raise
Contributor

A good thing to do here would be to catch and print the exception information before proceeding to the test data, since this code may otherwise hide a stack problem even when actually running on the LSST system.
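
A minimal sketch of what this suggestion might look like, not the code in this PR. The function `load_metadata` and the two callables `read_from_butler` and `read_from_parquet` are hypothetical stand-ins for the Butler path and the parquet fallback; only `traceback.print_exc()` is a real standard-library call.

```python
import traceback


def load_metadata(read_from_butler, read_from_parquet, path):
    """Try the Butler first; on failure, report the error and fall back.

    Both callables are hypothetical placeholders for the two code paths
    discussed in this review.
    """
    try:
        return read_from_butler(path)
    except Exception:
        # Print the full stack trace so a real Butler problem stays visible.
        traceback.print_exc()
        print(f"{path} is not available in Butler, attempting to read parquet files instead")
        return read_from_parquet(path)
```

Catching `Exception` instead of using a bare `except:` also lets `KeyboardInterrupt` and `SystemExit` propagate normally.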

self.tracts = list(set(x for v in self.metadata['visits'].values() for x in v.keys()))

def fetch_coadd_table(self, coadd_version='unforced'):
    table = 'qaDashboardCoaddTable'
Contributor

I don't think we will be using any dataset with this name anymore.

# hack label in ...
coadd_df['label'] = 'star'

self.coadd[table] = coadd_df
Contributor

Let's change the key here (and wherever else necessary) to be dataset rather than table, because the above table definition will not exist in the new format.

    ddf = self.get_visits_by_metric_filter(filt, metric)
except:
    print('WARNING: problem loading visits for {} metric and {} filter'.format(metric, filt))
    ddf = None
Contributor

Maybe print a traceback here, too? Or tighten the except to catch precisely what happens when the metric doesn't exist in the visit data?

Contributor Author

@timothydmorton yes, I think at least using traceback.print_exc() would help the cause. The PR is getting large, so I'm going to hit these items in a follow-up.
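
For the follow-up, here is a rough sketch that combines both options raised in this thread: a narrow `except` for the expected case and a traceback for everything else. The name `load_visits` and the `get_visits` callable are hypothetical stand-ins for `get_visits_by_metric_filter`. Treating `KeyError` as the signal for a missing metric is an assumption, since the thread does not say which exception is actually raised.

```python
import traceback


def load_visits(get_visits, filt, metric):
    """Return visit data for one metric/filter pair, or None on failure.

    `get_visits` is a hypothetical stand-in for get_visits_by_metric_filter.
    """
    try:
        return get_visits(filt, metric)
    except KeyError:
        # Assumed expected case: the metric is absent from the visit data.
        print(f"WARNING: no visit data for {metric} metric and {filt} filter")
    except Exception:
        # Anything else is unexpected, so keep the stack trace visible.
        print(f"WARNING: problem loading visits for {metric} metric and {filt} filter")
        traceback.print_exc()
    return None
```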

Contributor

timothydmorton commented Mar 15, 2020

Can we also update the test data in this PR so that the CI works? (Maybe make a 0.1% (or smaller) dataset if the download speed is an issue?)

Member

dharhas commented Mar 16, 2020

@timothydmorton the 1% dataset is still too big; I'll try for a 0.1% or 0.01% dataset.

3 participants