feat: Onboard NASA wildfire #275
6ed891b to 3120646
datasets/nasa_wildfire/pipelines/_images/run_csv_transform_kub/csv_transform.py
download_file(source_url, source_file)

logging.info("Reading file ...")
df = pd.read_csv(str(source_file))
Should this be read in chunks? What is the typical number of records or file size?
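For reference, a minimal sketch of what a chunked read could look like, assuming the per-chunk transform is independent of other rows. The function name `transform_in_chunks`, the `target` path, and the default `chunksize` are hypothetical and not taken from this PR; only `pd.read_csv` with its `chunksize` parameter and `DataFrame.to_csv` are real pandas calls.

```python
import pandas as pd


def transform_in_chunks(source, target, chunksize=100_000):
    """Read `source` in chunks and append each chunk to `target` as CSV.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    rows = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    # instead of loading the whole file into memory at once.
    for i, chunk in enumerate(pd.read_csv(source, chunksize=chunksize)):
        # Any per-chunk transforms (renames, type casts, filters) would go here.
        chunk.to_csv(
            target,
            mode="w" if i == 0 else "a",  # overwrite on first chunk, then append
            header=(i == 0),              # write the header row only once
            index=False,
        )
        rows += len(chunk)
    return rows
```

Whether this is worth it depends on the answer to the size question: for a small weekly file, a single `pd.read_csv` call is simpler and likely sufficient.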
datasets/nasa_wildfire/pipelines/_images/run_csv_transform_kub/csv_transform.py
TARGET_GCS_BUCKET: "{{ var.value.composer_bucket }}"
TARGET_GCS_PATH: "data/nasa_wildfire/past_week/data_output.csv"
PIPELINE_NAME: "past_week"
CSV_HEADERS: >-
Change this to multiline.
@adlersantos @happyhuman @nlarge-google please review the code again; it has been updated per the review comments.
One change: remove the storage bucket in dataset.yaml.
Approved
Description
Pipeline: past_week
Checklist
Note: Delete items below that aren't applicable to your pull request.
datasets/nasa_wildfire and nothing outside of that directory.