Refactor parallel subpackage to use dataframes and grouping #887
* Adds a new term rotation - we previously used frame but frame is separately used so a new term made sense
* Replaced plantbarcode with barcode to fit a broader range of applications
Replaces "%Y-%m-%d %H:%M:%S.%f" with "%Y-%m-%dT%H:%M:%S.%fZ"
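To illustrate the timestamp format change, here is a minimal sketch that re-encodes a timestamp from the old format string into the new one using only the standard library. The helper name `convert_timestamp` is hypothetical and not part of PlantCV, and the sketch assumes the original timestamps are already in UTC, since no timezone shift is applied.

```python
from datetime import datetime

# Format strings taken from the commit message above.
OLD_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
NEW_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def convert_timestamp(old_stamp: str) -> str:
    """Re-encode a timestamp string from the old format into the new one.

    Hypothetical helper. Assumes the input already represents UTC,
    so the trailing "Z" is appended without any timezone conversion.
    """
    parsed = datetime.strptime(old_stamp, OLD_FORMAT)
    return parsed.strftime(NEW_FORMAT)


new_stamp = convert_timestamp("2023-01-15 08:30:00.000000")
print(new_stamp)

# The new format string also parses its own output, so it round-trips.
round_trip = datetime.strptime(new_stamp, NEW_FORMAT)
print(round_trip)
```

Existing configurations that set the timestamp format explicitly would presumably keep working; only the default is described as changing.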
Replaces metadata_parser with a new modular workflow that parses three types of datasets and uses a dataframe structure to do metadata filtering
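As a rough illustration of the filter-then-group idea, the sketch below uses plain dictionaries in place of a dataframe so that it runs without third-party packages. It is not the parser code from this PR: the record fields, the helper names `filter_records` and `group_records`, and the behavior of the `"auto"` naming option are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical metadata records; in the PR these would be dataframe rows.
records = [
    {"filepath": "a_vis.png", "barcode": "A1", "camera": "VIS",
     "timestamp": "2023-01-15T08:30:00.000000Z"},
    {"filepath": "a_nir.png", "barcode": "A1", "camera": "NIR",
     "timestamp": "2023-01-15T08:30:00.000000Z"},
    {"filepath": "b_vis.png", "barcode": "B2", "camera": "VIS",
     "timestamp": "2023-01-15T09:00:00.000000Z"},
    {"filepath": "b_psii.png", "barcode": "B2", "camera": "PSII",
     "timestamp": "2023-01-15T09:00:00.000000Z"},
]


def filter_records(rows, filters):
    """Keep rows whose value is in the allowed list for every filter term."""
    return [
        row for row in rows
        if all(row.get(term) in allowed for term, allowed in filters.items())
    ]


def group_records(rows, groupby, group_name="auto"):
    """Group rows by the groupby terms and name the images in each group.

    With group_name="auto" (an assumed option), images are named image1,
    image2, and so on in input order; otherwise the value of the named
    metadata term is used as the image name.
    """
    groups = defaultdict(dict)
    for row in rows:
        key = tuple(row[term] for term in groupby)
        if group_name == "auto":
            name = f"image{len(groups[key]) + 1}"
        else:
            name = row[group_name]
        groups[key][name] = row["filepath"]
    return dict(groups)


filtered = filter_records(records, {"camera": ["VIS", "NIR"]})
auto_groups = group_records(filtered, ["barcode", "timestamp"])
named_groups = group_records(filtered, ["barcode", "timestamp"], "camera")
```

Each resulting group would correspond to one workflow job, with the image names acting as the keys the workflow uses to look up its inputs.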
The workflow configuration template needed to be updated to match updates to WorkflowConfig
* Add a new module for standardizing and implementing workflow command-line and notebook input arguments
* Update job_builder to plug inputs into the new argument framework
* Update multiprocess tests
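The idea of sharing one set of inputs between a script and a notebook could be sketched as follows. This is not the module added in this PR: `build_parser`, `parse_inputs`, `NotebookInputs`, and every flag name here are hypothetical, chosen only to show an `argparse` parser and a notebook-side class that expose the same attributes, including a type check for an `images` input that is not a list.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser for a workflow script."""
    parser = argparse.ArgumentParser(description="Hypothetical workflow inputs")
    parser.add_argument("--names", required=True,
                        help="comma-separated names, one per input image")
    parser.add_argument("--result", required=True, help="output results file")
    parser.add_argument("--outdir", default=".", help="output directory")
    parser.add_argument("--debug", default=None, choices=["print", "plot"])
    parser.add_argument("images", nargs="+", help="input image paths")
    return parser


def _name_images(names, images):
    """Pair each image path with its name; the counts must match."""
    name_list = names.split(",")
    if len(name_list) != len(images):
        raise ValueError("number of names must match number of images")
    return dict(zip(name_list, images))


def parse_inputs(argv):
    """Parse script arguments and attach a name-to-path mapping."""
    args = build_parser().parse_args(argv)
    args.named_images = _name_images(args.names, args.images)
    return args


class NotebookInputs:
    """Notebook-side stand-in exposing the same attributes as the parser."""

    def __init__(self, images, names, result, outdir=".", debug=None):
        if not isinstance(images, list):
            raise TypeError("images must be a list of file paths")
        self.images = images
        self.names = names
        self.result = result
        self.outdir = outdir
        self.debug = debug
        self.named_images = _name_images(names, images)


script_args = parse_inputs(
    ["--names", "image1,image2", "--result", "out.json", "a.png", "b.png"]
)
notebook_args = NotebookInputs(
    images=["a.png", "b.png"], names="image1,image2", result="out.json"
)
```

Because both objects expose `named_images`, `result`, `outdir`, and `debug`, the body of a workflow written against one should run unchanged against the other, which is the migration path the commit message describes.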
Still needs work to reduce complexity
Codecov Report
@@            Coverage Diff            @@
##               4.x      #887   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files          159       160    +1
  Lines         6738      6714   -24
=========================================
- Hits          6738      6714   -24
Error handling for "images" input in Workflow Image when input is not a list.
New version breaks something in acute_vertex that we can figure out later
@JorgeGtz found an issue with the I think I can refactor
Describe your changes
This PR makes several major changes to the parallel subpackage.

* .github/workflows/continuous-integration.yml:
  * [...] pip to install PlantCV instead of setup.py.
* plantcv/parallel/__init__.py:
  * [...] convert_datetime_to_unixtime and check_date_range.
  * [...] workflow_inputs function and WorkflowInputs class. These come from a new module for handling workflow inputs in Jupyter notebooks and scripts, which was added to make it easier to migrate from Jupyter to a parallel workflow script.
  * The WorkflowConfig default timestamp format was updated to an ISO 8601 UTC datetime.
  * [...] WorkflowConfig coprocess attribute and replace with groupby and group_name attributes. The new attributes are used to group images in the new dataframe-based metadata parser framework and to name the image inputs to parallel workflows.
  * The rotation metadata attribute was added to WorkflowConfig.
  * The plantbarcode metadata attribute in WorkflowConfig was renamed to barcode to be more general.
* plantcv/parallel/parsers.py:
  * [...] (phenodata) was added to the parsers module.
* plantcv/parallel/job_builder.py:
  * [...] workflow_inputs-based argparse framework.
  * [...] (image1, image2, etc.)
* plantcv/parallel/workflow_inputs.py:
  * The WorkflowInputs class is used to set Jupyter notebook input variables in a framework that is compatible with the command-line arguments used in parallel workflow scripts.
  * The workflow_inputs function creates a standardized argparse command-line argument parser for workflows.
* plantcv/parallel/process_results.py:
* plantcv/utils/converters.py:
  * The json2csv util function was updated to handle grouped output data.
  * json2csv now only outputs a single CSV file in long format.

Additionally, relevant tests were added/updated. Documentation was updated where necessary.
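For readers unfamiliar with long format, the sketch below shows the general shape: one row per sample and trait, rather than one column per trait. It does not reproduce the json2csv implementation or its actual column names. The nested `results` structure, the column names, and the helpers `to_long_rows` and `write_long_csv` are assumptions for illustration.

```python
import csv
import io

# Hypothetical nested results: sample -> trait -> observation.
results = {
    "A1": {
        "area": {"value": 1234, "label": "pixels"},
        "height": {"value": 56, "label": "pixels"},
    },
    "B2": {
        "area": {"value": 987, "label": "pixels"},
        "height": {"value": 43, "label": "pixels"},
    },
}

# Assumed column names for the long-format table.
FIELDNAMES = ["sample", "trait", "value", "label"]


def to_long_rows(data):
    """Flatten nested results into one row per (sample, trait) pair."""
    rows = []
    for sample, traits in data.items():
        for trait, observation in traits.items():
            rows.append({
                "sample": sample,
                "trait": trait,
                "value": observation["value"],
                "label": observation["label"],
            })
    return rows


def write_long_csv(rows, handle):
    """Write the long-format rows to an open text handle as one CSV table."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_long_csv(to_long_rows(results), buffer)
csv_lines = buffer.getvalue().splitlines()
print("\n".join(csv_lines))
```

A single long table keeps the column set fixed regardless of which traits a workflow records, at the cost of needing a pivot step for wide-format analyses.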
Type of update
Is this a: New feature or feature enhancement
Associated issues
Closes #474
Closes #423
Closes #538
Replaces #759