Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Installing optional dependencies per docs is not working #117

Closed
nsheff opened this issue Nov 9, 2023 · 6 comments
Closed

Installing optional dependencies per docs is not working #117

nsheff opened this issue Nov 9, 2023 · 6 comments
Assignees

Comments

@nsheff
Copy link
Contributor

nsheff commented Nov 9, 2023

When I try to install the db backend deps, it's not working:

pip install pipestat[db-backend]>=0.6.0a5
WARNING: pipestat 0.6.0a5 does not provide the extra 'db-backend'

Also, as an aside, you have pydantic in here twice:

https://github.com/pepkit/pipestat/blob/dev/requirements/requirements-db-backend.txt

@donaldcampbelljr
Copy link
Contributor

Using:
pip install pipestat[db-backend]==0.6.0a5 seems to work a bit better by installing pipestat, though NOT the desired optional dependencies.

The output led me to the .whl metadata file:
https://files.pythonhosted.org/packages/81/ea/4cfe555c56bb93ab9542f013d449e5743be5c89ca5938ec68750babc33d0/pipestat-0.6.0a5-py3-none-any.whl.metadata

Looking at this file, I notice pipestatreader is listed as extra, but I do not see db-backend:

Metadata-Version: 2.1
Name: pipestat
Version: 0.6.0a5
Summary: A pipeline results reporter
Home-page: https://github.com/pepkit/pipestat
Author: Michal Stolarczyk, Nathan Sheffield
License: BSD2
Keywords: project,metadata,bioinformatics,sequencing,ngs,workflow
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: jsonschema
Requires-Dist: logmuse >=0.2.5
Requires-Dist: oyaml
Requires-Dist: ubiquerg >=0.6.1
Requires-Dist: yacman >=0.9.1
Requires-Dist: PyYAML
Requires-Dist: pandas
Requires-Dist: eido
Requires-Dist: jinja2
Provides-Extra: pipestatreader
Requires-Dist: uvicorn ; extra == 'pipestatreader'

Trying to install pipestatreader via:
pip install pipestat[pipestatreader]>=0.6.0a5 gives no output!

but using pip install pipestat[pipestatreader]==0.6.0a5 successfully installs pipestat reader dependencies.

Next steps:
-Determine why the db-backend is not making it into the whl file.

@nsheff
Copy link
Contributor Author

nsheff commented Nov 9, 2023

is the hyphen causing a problem, by chance? I don't think you can name packages with hyphens, they sometimes cause issues

@nsheff
Copy link
Contributor Author

nsheff commented Nov 9, 2023

you have a bug in your setup.py --

you're setting OPT_DEPENCIES = {} again, after adding db-backend to it.

@nsheff
Copy link
Contributor Author

nsheff commented Nov 9, 2023

even if that's not a bug, that's confusing -- just create the two extras and then add them

donaldcampbelljr added a commit that referenced this issue Nov 9, 2023
@donaldcampbelljr
Copy link
Contributor

Ok, looks like that fix worked for v0.6.0a6 release.

pip install pipestat[db-backend]==v0.6.0a6

Built Wheel Metadata:

Metadata-Version: 2.1
Name: pipestat
Version: 0.6.0a6
Summary: A pipeline results reporter
Home-page: https://github.com/pepkit/pipestat
Author: Michal Stolarczyk, Nathan Sheffield
License: BSD2
Keywords: project,metadata,bioinformatics,sequencing,ngs,workflow
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: jsonschema
Requires-Dist: logmuse >=0.2.5
Requires-Dist: oyaml
Requires-Dist: ubiquerg >=0.6.1
Requires-Dist: yacman >=0.9.1
Requires-Dist: PyYAML
Requires-Dist: pandas
Requires-Dist: eido
Requires-Dist: jinja2
Provides-Extra: db-backend
Requires-Dist: psycopg2-binary ; extra == 'db-backend'
Requires-Dist: pydantic <2.0.0,>=1.10.7 ; extra == 'db-backend'
Requires-Dist: sqlmodel >=0.0.8 ; extra == 'db-backend'
Requires-Dist: pydantic ; extra == 'db-backend'
Provides-Extra: pipestatreader
Requires-Dist: fastapi ; extra == 'pipestatreader'
Requires-Dist: uvicorn ; extra == 'pipestatreader'

Oh, I forgot to take out the 2nd pydantic, I'll do that for the next round.

donaldcampbelljr added a commit that referenced this issue Dec 22, 2023
* Add get_records to pipestat_manager and add related test #75

* clean up signatures #72

* clean up docstrings

* remove superfluous get_table_name function

* Reduce get_orm and get_model into one function. #71

* Fix CLI to use record-identifier instead of sample-name

* Fix CLI to use pipeline_type #37

* modify get_records to return new structure #75

* fix html report generation using get_records

* add simple CLI test for reporting and retrieving #81

* Pipestat reader (#80)

* update version in prep for pipestat table

* add table function

* add initial stats function from looper, remove counter

* fixing _create_stats_summary and get_file_for_project

* fix sample_level stats reporting

* add object reporting

* remove redundancies

* lint

* add assertion for table generation to pre-existing test

* use pipeline_type when retrieving samples

* Fix stats table generation output to look better

* fix return types

* fix list vs List

* fix typo for stats table

* fix doc strings

* update func names for disambiguation

* fix key error issue in html_reports

* clean up

* adjust docstrings

* update docstrings

Co-authored-by: Nathan Sheffield <nsheff@users.noreply.github.com>

* adjust docstrings, rename html_reports_pipestat.py to reports.py

* adjust LOGGER.info string

* simplify objs and stats generation

* remove old todos

* remove sample_name, add pipeline_name to project field definitions

* work towards using project_name instead of sample_name from project-level pipelines

* fix check record bug, add better error message for pipeline type.

* add checks for pipeline types, lint

* fix get_samples with pipeline type

* fix fetch_pipeline_results

* add basic tests for project_level, fix associated bugs

* allow table creation to use project_name if applicable

* add passing project name to file_backend

* refactor to use r_id as input

* update doc strings

* more refactoring of records vs samples to be more general

* move table generation to reports.py

* allow status file dir path to be associated with config path OR the file path to ensure looper compatibility.

* add output_dir as a parameter for placing reports and stat files. Otherwise defaults to results_file dir or config_file dir

* add obtaining schema_path from pipestat config for Looper compatibility

* added select function to main pipestat class

* added select_txt and select_distinct function to main pipestat class

* added get_one_record function to main pipestat class #70

* Removing CLI req that one must supply config and schema because the user can supply schema within the config. For Looper compatibility.

* Revert "added get_one_record function to main pipestat class #70"

This reverts commit b477060.

* Revert "added select_txt and select_distinct function to main pipestat class"

This reverts commit 4588f4f.

* Revert "added select function to main pipestat class"

This reverts commit ee98145.

* Add basic glossary page in the form of a table. Addresses: pepkit/looper#290

* remove redundant tag

* add RecordNotFoundError across both backends #74

* Add PipelineTypeNotSuppliedError to hide dbbackend information #74

* Initial work towards a fastapi implementation of pipestat reader
#22

* Initial work to separate classes #78

* working implementation of child classes for reporting using DBBACKEND. Manual testing confirmed. pytest Tests broken. #78

* implement report for filebackend

* fix inheritance issue with pipeline_type

* typo

* major refactoring, many tests still broken #78

* Fix signatures for report and explicitly state input arguments.

* remove pipeline type check

* add projectpipestamanager and refactor project_name to record_identifier

* fix get_status

* fix tests

* fix all tests #78

* add todo

* code cleanup and remove getting_table_name

* clean up

* partial removal of pipeline_type logic from DB backend

* remove pipeline_type input arguments from File Backend

* clean up

* add simple class to wrap SamplePipestatManager and ProjectPipestatManager

* change pipestatmanager to mutable mapping #78

* lowercase attributes #78

* polish PipestatBoss #78

* env_vars for pipeline_type

* update docs

* working proof of concept to retrieve results #22

* update readme and add getting output schema

* add else catch for outptu schema and fix typo

* add ability to run with uvicorn if calling reader.py directly.

* add __init__.py and main func

* add more endpoints #22

* update endpoints, add Query

* allow for more complex filtering using post and pydantic models #22

* Add image and file endpoints #22

* lint

* Add basic arg parser to pass absolute path to config file

* Add endpoint for get_records()

* clean up

* modify get_records to return new structure #75

* fix html report generation using get_records

* attempt global creation of psm, does not work

* fix creation of global psm

* change fetching by filetype

* add cli option for pipestat serve

* simplify and fix pipestat serve #22

* add host and port arguments #22

* add config error

* update readme

---------

Co-authored-by: nsheff <nsheff@users.noreply.github.com>

* list to List

* fixed typing in python3.8

* update changelog

* fix table_name_bug

* "0.5.4a1" for alpha release

* "0.6.0a1" for alpha release

* Dev json schema (#87)

* first pass to allow for output_schema as a true JSON schema #85

* remove unused import

* simplify logic, extend _safe_pop_one_mapping to handles multiple keys

* disambiguation key vs keys in parsed schema

* make samples and project objects instead of arrays, add more test assertions, clean up docstrings.

* add status data to string representation of ParsedSchema

* add retrieve_distinct function for #70 and databio/bedhost#75

* add `pipestat link` functionality to create symlinks (#90)

* begin work on file inking via pipestat

* continue implementing link for an output_dir OR the results_file.yaml
#89

* implemented symlinks for filebackend given output_dir
#89

* implemented symlinks for filebackend using results.yaml, polish tests
#89

* fix typo, clean up

* more lcean up

* begin rework to be backend agnostic

* refactor to place in folders specific to a result_identifier

* add more complex types, add temp directory for better testing

* fix recursive finding of paths

* clean up, move to abstract class, confirm works for both backends

* remove unused test files

* add jinja2 to requirements-all.txt #91

* complex example with collision

* add warning

* add cli option

* remove unused functions

* update changelog

* clean up docstrings, remove unused variables, refactor link_dir

* clean up more docstrings

* remove unnecessary print statement to reduce verbosity

* Begin work on adding created and modified datetime, DateTime objects are not parsing correctly, tests broken #93

* fix datetime objects #93

* fix pipeline_name field beginning with underscore #94

* timestamp continuation, list_recent_results does NOT work

* fix doc string for schema

* add filtering by created and modified time #93

* update version for alpha release v0.6.0a2

* fix list reversing #93

* polish tests #93

* add filebackend list_recent_results, remove records broken #93

* fix remove records bug #93

* update doc strings #93

* update change log for list_recent_results

* begin work on retrieve_multiple #96

* add db backend, test broken #96

* fix db backend logic #96

* implement file backend #96

* fix docstrings

* some minor polishing

* update readme

* move pipestat reader per #22

* fix pipestat reader setup and manifest per #22

* Add optional dependencies via setup #22

* pipestat can now run without DB specific dependencies #22

* Some refactoring #64

* add other requirements to test

* refactor to have ParsedSchemaDB inherit from ParsedSchema for code reduction #64

* Clean up

* return total record count #75

* make initialization of file and DB backend into separate functions

* add decorator for checking dependencies

* polishing decorator for checking dependencies

* consolidate decorator and ancillary function into one

* first pass #99, some tests broken

* fix all tests #99

* refactor store to cfg #99

* fix bug with table path

* setitem is now a wrapper for report

* fix getitem implementation, add test #99

* Begin work on select_records, retrieve_one, retrieve_many #103

* Implement select_records, and cursor pagination #103

* clarify init message

* Update dbbackend.py

* add loading JSON and column.contains for filtering #103

* testing differences

* add or_ and and_ sqlmodel operators #103

* some comments

* steps toward smarter pytesting

* #103 -- added multi key filter

* typo

* polish changes to select_records and add tests. #103

* fix typo

* remove Union for python 3.8 compat

* move get_nested_column and define_sqlalchemy_type outside of select_records

* update doc strings

* Modify to be true JSONs chema with using $ref and $def for custom objects (image, file) #85

* Update pipestat/backends/abstract.py

Co-authored-by: Oleksandr <sasha99250@gmail.com>

* Update pipestat/backends/db_backend/dbbackend.py

Co-authored-by: Oleksandr <sasha99250@gmail.com>

* Update pipestat/backends/db_backend/dbbackend.py

Co-authored-by: Oleksandr <sasha99250@gmail.com>

* Update pipestat/pipestat.py

Co-authored-by: Oleksandr <sasha99250@gmail.com>

* Update pipestat/pipestat.py

Co-authored-by: Oleksandr <sasha99250@gmail.com>

* Update tests/data/sample_output_schema.yaml

Co-authored-by: Oleksandr <sasha99250@gmail.com>

* Polish PR, fix docstrings, add wrapper for remove

* remove unnecessary mutablemapping inheritance

* lint

* cleaning

* fixed failing test

* remove old, unused validate_schema function

* change ALL output_schemas to reflect JSON schema

* lint

* per request, remove select function as well as dynamic_filter

* add to do

* revert PipestatManager to a MutableMapping

* remove select_txt and select_distinct

* revert select_distinct for now

* remove unused backend funcs, tests broken

* implement list_recent_results wrapper (DBBackend only)

* implement list_recent_results wrapper (DBBackend only)

* some polish, begin implementing select_records

* more work towards select_records

* change operator func, add list comprehension for sets

* implement AND OR Logic and fix tests, add sorting

* fix time stamp filtering if NOT nested

* lint and fix db test

* attempt to pare down results based on columns, does not work.

* add column filtering for filebackend, add tests, add no filter condition retrieving all records on filebackend

* fix complex object filtering

* implement select_distinct for filebackend

* select_dsitinct now works on record identifiers for filebackend, polish tests

* fix retrieve_one and retrieve_many for filebackend

* fix timestamp test for filebackend

* use isinstance for type checking

* refactored for efficiency

* fix recursive schema after merge

* remove get_records, fix return type for select_records if given columns

* begin remove of get_one_record

* deleted get_one_record function

* fix get_status on dbbackend using select_records

* remove retrieve from dbbackend and reports

* replace retrieve with retrieve_one and begin replacing retrieve with select_records

* lint

* fix select_records filebackend bug, remove retrieve from tests

* finish removing retrieve function

* attempt re-implementing record not found error for retrieve_one and retrieve_many

* fix bugs with select_records and RecordNotFound, implement ColumnNotFound exception

* lint

* update changelog and version for 0.6.0a3 release

* refactoring based on ruff

* update cli and usage and api docs, lint

* update documentation focused on schemas

* add result_identifier as parameter to retrieve_one

* add result_identifier as parameter to retrieve_many

* __version__ = "0.6.0a4"

* Finish adding pytest.mark.skipif

* lint

* move general tests to test_pipestat, remove some redundancy

* implement clear_status pytest for filebackend #110

* polish test set_item, report_overwrite

* add pytest fixture val_dict

* add pytest fixture values_sample, values_project

* add pytest fixture values_complex_linking, move fixtures to conftest.py

* deleted unnecessary print

* add test_select_no_filter_limit

* black formatting

* add pytest fixture for ranges

* lint

* fix record_identifier being returned twice if it is an input column

* fix typo

* change retrieve_one output to no longer we wrapped by select_records output

* fix schema propert to return psm.cfg["_schema"].to_dict()

* fix schema propert to return psm.cfg["_schema"]

* add psm.schema.original_schema to parsed schema

* add r_id = record_identifier or self.record_identifier to retrieve_one

* fix majority of tests due to retrieve_one

* lint

* fix table creation bug with retrieve_one

* remove RecordNotFoundError because it breaks pypiper

* fix parsed_schema json schema parsing to be correct hierarchy

* update jupyter api docs

* update cli docs

* fix property access #113

* add retrieving multiple results using retrieve_one

* list to List

* add quickstart guide

* add psm.schema.resolved_schema

* remove unused function

* version bump to v0.6.0a5

* fix config_path property

* re-implement RecordNotFoundError

* be more selective about which exceptions to pass

* potential solution for #117

* version bump 0.6.0a6

* remove redundant pydantic

* fix pipestat reader dependency checking

* fix cli bug regarding env variables

* retrieve_one record_identifier defaults to None to allow for use with env variables, clean README.md

* further documentation polishing

* fix obtaining record_identifier from cfg file

* correct config.md

* correct pipestat_specification.md

* 0.6.0a7 prerelease

* Clarify return on select_records

* ensure requirements list yacman>= 0.9.2

* potential fix for #119

* change warning to debug for creating status table

* fix pipestat creating subdirectories before creating results file.

* First proof of concept for #120

* simply do replacement at beginning

* Fixed #106

* now functional, create aggregate_results.yaml which replaces self._data before reporting #120

* replace spaces in names with %20 during html link generation

* fix reported_objs bug

* expandpath for link_dir during pipestat link

* add tests for new aggregate_results functionality

* refactor db-backend to dbbackend

* fix stats and objects buttons for summary page #130

* fix stats table on summary page #130

* fix broken status if multi_results_files #130

* fix log and profile files #130

* fix tsv columns #130

* Add check for pipeline type during report generation #130

* refactor Sample to Record #130

* fix refactor for sample page

* fix stats.tsv and objects.yaml generation to incorporate pipeline types

* add project_objects page #130

* minor polishing

* remove temp test

* lint

* add passing looper's samples to pipestat summarize to populate table

* remove unused code

* version bump for pre-release v0.6.0a9

* use "object_type" for image and file types or fall back on using "type", add try except

* ensure project level objects are accurate on summary/index page

* remove project name from stats and object file names

* update changelog

* 0.6.0a10 bump for pre-release

* allow project-level figures to display as columns instead of rows

* 0.6.0a11 pre-release

* v0.6.0 release prep

* lint

---------

Co-authored-by: nsheff <nsheff@users.noreply.github.com>
Co-authored-by: Khoroshevskyi <sasha99250@gmail.com>
@donaldcampbelljr
Copy link
Contributor

Closed via #92

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants