feat(downsample_cells_template): add downsample_cells_template functi…#393
feat(downsample_cells_template): add downsample_cells_template functi…#393fangliu117 merged 1 commit intodevfrom
Conversation
…on and unit tests
There was a problem hiding this comment.
Pull Request Overview
This PR adds a new downsample_cells_template function that provides a platform-agnostic interface for cell downsampling operations, converted from NIDAP template functionality. The implementation includes comprehensive parameter handling, multiple input format support, and flexible output options.
- Implements
run_from_jsonfunction with support for JSON, dict, and direct DataFrame inputs - Adds comprehensive test coverage including I/O workflows, error validation, and function mocking
- Provides CLI interface and handles multiple file formats (CSV, pickle)
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/spac/templates/downsample_cells_template.py |
Main template implementation with parameter parsing, data loading, and downsample execution |
tests/templates/test_downsample_cells_template.py |
Comprehensive unit tests covering various input formats, error cases, and function validation |
| import pickle | ||
|
|
||
| # Add parent directory to path for imports | ||
| sys.path.append(str(Path(__file__).parent.parent.parent)) |
There was a problem hiding this comment.
Modifying sys.path at runtime is an anti-pattern that can cause import issues. Consider using relative imports or proper package structure instead.
| sys.path.append(str(Path(__file__).parent.parent.parent)) |
| sys.path.append( | ||
| os.path.dirname(os.path.realpath(__file__)) + "/../../src" | ||
| ) |
There was a problem hiding this comment.
Modifying sys.path in tests is an anti-pattern. Consider using proper test discovery mechanisms or PYTHONPATH configuration instead.
| sys.path.append( | |
| os.path.dirname(os.path.realpath(__file__)) + "/../../src" | |
| ) |
| print("Downsampled! Processed dataset info:") | ||
| print(down_sampled_dataset.info()) |
There was a problem hiding this comment.
[nitpick] Using print statements in library code reduces flexibility for users who may want to control logging. Consider using the logging module instead.
| print("Downsampled! Processed dataset info:") | |
| print(down_sampled_dataset.info()) | |
| logging.info("Downsampled! Processed dataset info:") | |
| logging.info(down_sampled_dataset.info()) |
| print( | ||
| f"Downsample Cells completed → {saved_files[output_file]}" | ||
| ) | ||
| return saved_files | ||
| else: | ||
| # Return the dataframe directly for in-memory workflows | ||
| print("Returning DataFrame (not saving to file)") | ||
| return down_sampled_dataset | ||
|
|
||
|
|
||
| # CLI interface | ||
| if __name__ == "__main__": |
There was a problem hiding this comment.
[nitpick] Using print statements in library code reduces flexibility for users who may want to control logging. Consider using the logging module instead.
| print( | |
| f"Downsample Cells completed → {saved_files[output_file]}" | |
| ) | |
| return saved_files | |
| else: | |
| # Return the dataframe directly for in-memory workflows | |
| print("Returning DataFrame (not saving to file)") | |
| return down_sampled_dataset | |
| # CLI interface | |
| if __name__ == "__main__": | |
| logging.info( | |
| f"Downsample Cells completed → {saved_files[output_file]}" | |
| ) | |
| return saved_files | |
| else: | |
| # Return the dataframe directly for in-memory workflows | |
| logging.info("Returning DataFrame (not saving to file)") | |
| return down_sampled_dataset | |
| # CLI interface | |
| if __name__ == "__main__": | |
| logging.basicConfig(level=logging.INFO) |
| return saved_files | ||
| else: | ||
| # Return the dataframe directly for in-memory workflows | ||
| print("Returning DataFrame (not saving to file)") |
There was a problem hiding this comment.
[nitpick] Using print statements in library code reduces flexibility for users who may want to control logging. Consider using the logging module instead.
…on and unit tests