Export breaks with large concept sets #113

Closed
anthonysena opened this issue Jun 27, 2016 · 0 comments
@anthonysena
Collaborator

When a concept set has a large number (>10,000) of included concepts, the IN clause in the underlying query fails. We need to add methods to VocabularyService.java that take the ConceptSetExpression and embed it into the existing queries for finding included/mapped codes, instead of passing a comma-delimited list of concept IDs.
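A minimal sketch of the difference between the two approaches, in Java. This is not the code that closed the issue: the class and method names are hypothetical, and the SQL that resolves the concept set is passed in as a plain string (the real service would presumably generate it from the ConceptSetExpression). Only the `concept` table and its `concept_id` and `concept_name` columns are taken from the OMOP vocabulary.

```java
import java.util.List;

// Hypothetical helper, for illustration only.
final class IncludedConceptQuery {

    // Old approach: every concept id becomes a literal in the IN list,
    // so the statement grows with the size of the concept set.
    static String withLiteralList(String vocabSchema, List<Long> conceptIds) {
        StringBuilder ids = new StringBuilder();
        for (Long id : conceptIds) {
            if (ids.length() > 0) {
                ids.append(',');
            }
            ids.append(id);
        }
        return "select concept_id, concept_name from " + vocabSchema
                + ".concept where concept_id in (" + ids + ")";
    }

    // Proposed approach: embed the SQL that resolves the concept set as a
    // subquery, so the statement size no longer depends on the concept count.
    // conceptSetSql is assumed to return a single column named concept_id.
    static String withSubquery(String vocabSchema, String conceptSetSql) {
        return "select c.concept_id, c.concept_name from " + vocabSchema
                + ".concept c join (" + conceptSetSql + ") cs"
                + " on cs.concept_id = c.concept_id";
    }
}
```

With the subquery form the database resolves the concept set itself, so no list of ids has to travel from the service into the statement text.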

@anthonysena anthonysena added this to the v1.2.0 milestone Jun 27, 2016
fdefalco pushed a commit that referenced this issue Sep 26, 2016
* Updated schema migration scripts to specify schema.

Added new application.properties value:
flyway.placeholders.ohdsiSchema=${flyway.placeholders.ohdsiSchema}
spring.jpa.properties.hibernate.default_schema=${datasource.ohdsi.schema}

Manually applying JPA properties in DataAccessConfig.

* sparql

* base8.17

* LAERTES base

* linkoutdata

* added a new test for the linkout data retrieval

* add ? to makefile

* updates to allow for concept set persistence

* OHDSI-49 =>
- move getCohortSummaryData(), getCohortSummaryAnalyses() and
corresponding methods/mapper into CohortResultsService
- remove /sourceKey from path and modify service methods in
CohortAnalysisService
- add getOhdsiSchema() into AbstractDaoService

* adding flyway script

* fix case sensitive issue

* Rename v1.0.1.0__conceptsets.sql to V1.0.1.0__conceptsets.sql

* Support multiple events per person in feasibility Study.

* documentation and job information improvements

* flyway postgresql and oracle migrations for concept sets

* getEvidenceSummaryBySource

* Basic Caching and Record Count Updates

* getsummary

* makefile

* Implement Abnormal filter.

Fixes #46.

* implement getEvidenceDetails

* add id to evidencesummary

* add judgement before get json

* Create getDrugEraTreemap

getDrugEraTreemap returns data for ACHILLES DRUG_ERA summary report, which is currently displayed as treemap and tabular view.  This function has parameter @ConceptIdList, such that the user can restrict to a subset of all records in the table, rather than returning all.

* Rename getDrugEraTreemap to getDrugEraTreemap.sql

* Create getDrugEraPrevalenceByGenderAgeYear.sql

* Create getConditionOccurrenceTreemap.sql

Has parameter @conceptIdList to specify the subset of records you want to come back

* Create getDescendentOfAncestorConcepts.sql

Params:    @cdm_schema - the schema where the vocabulary lives
@id  - the target concept you are starting with
@ancestorVocabularyId - the vocab of the ancestor concepts you are interested in
@ancestorClassId - the concept class of the ancestor concept you are interested in
@siblingVocabularyId - the vocab of the descendant of the ancestor that you want
@siblingClassId - the concept class of the descendant of the ancestor that you want



Example target use case:

Given an RxNorm ingredient concept, you want to find all other RxNorm ingredient concepts that are descendants of the same ATC3 ancestor(s).

* Adding methods for Penelope functionality

* Fixed cohort count for the 'nullStudy' case.

Now that we support 'event-level' feasibility studies, the nullStudy case needed to do a simple select count(*) instead of select count(distinct subject_id)

* enabling compression for services

* Person Service Functionality for Profiles / Apollo

* ensure proper table qualifiers are utilized for vocabulary and result tables

* Updates to WebAPI for Penelope

* new export functionality for cohort results

* Fixes for CDMResultsService.java

* no change to V1.0.0.6.2__schema-create_laertes.sql. Created V1.0.0.6.3__schema-create_laertes.sql to drop null constraint

* flyway changes to drop null constraint on ohdsi.drug_hoi_evidence in oracle and sql server (NOT TESTED - Postgres has been tested)

* Descendant concepts for concept list

* export hotfix

removed trailing tab character on cohort result export file

* Additional evidence calls for Penelope

* Penelope WebAPI updates

* Changes for retrieving label evidence

* adding repositories for labels and mappings

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* Update getExposureOutcomePredictors.sql

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* Adding methods for Penelope functionality

* more transparent error handling to help with debugging

* git formatting?

Conflicts:
	src/main/java/org/ohdsi/webapi/cohortresults/ExposureCohortResult.java
	src/main/java/org/ohdsi/webapi/cohortresults/TimeToEventResult.java
	src/main/java/org/ohdsi/webapi/service/CohortResultsService.java

* git formatting?

* add hoiname,drugname
solve duplication

* Adding methods for Penelope functionality

* Update firstConditionRelativeToIndex.sql

bounded the information that comes back to +/- 1000 days

* Update getTimeToEventDrilldown.sql

bounded graphs to -1000 to 1000

* Update getConceptRecordCount.sql

added measurement counts

* OHDSI-58 => fix issue #71

* Merge origin/Penelope into Penelope

Conflicts:
	src/main/java/org/ohdsi/webapi/evidence/ConceptOfInterestMapping.java

* Flyway scripts multihomed tables for Penelope

* Added support for specifying at least/at most in criteria groups.

* Refactoring table names

* Penelope search enablement

* Flyway script fixes

* added ordering to returned domain and vocabulary lists

* OHDSI-60 => fix Oracle errors in penelope scripts.

* Fix for additional criteria not properly restricting primary events.

* Fixed matching cohort definition from calypso feasibility study.

Switched logic from looking for ANY to looking for NOT ALL.  This covers the case when the additional criteria has 'At Least' or 'At Most' specified for the criteria group.

* Changes to support schema awareness:

Added flyway properties to allow injection of ohdsi schema into migration scripts.
Updated MSSQL migration scripts to specify schema to create tables in.
Modified Hibernate properties to specify default schema of ${ohdsi_schema}
Removed cdm database and cdm schema from maven properties. These are specified in the source and source_daimon tables.
Removed 'if table exists' checks from migration scripts. Flyway will now fail if a table already exists. This is better behavior, because an existing out-of-date table in the database would otherwise keep Flyway from creating the correct table, which caused confusion.

* Additional updates related to oracle.

Tested with Oracle.
Removed unnecessary sequences:  oracle and postgres only depend on hibernate_sequence for id generation.

* Use different names for temp tables in runHeraclesAnalyses.sql (fixes #62)

* Specify schema when getting cohort analyses.

* Changed end_time column from timestamp to datetime for MsSQL.

* Added fetch join to cohort definition repository to eliminate n+1 query.

* Fixes #80 - Created a schema variable to jobExecutions.sql to load tables for Jobs

* Initialize collections to an empty list (to avoid null checks)

* Format sql literal dates into DATEFROMPARTS() to enable cross dbms compatibility.

Fixes OHDSI/Circe#50.
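
A rough sketch of what formatting a date literal into DATEFROMPARTS() could look like. The class and method names are hypothetical, and the input is assumed to be an ISO `yyyy-MM-dd` string; the actual Circe change may parse and validate dates differently.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class SqlDateLiteral {

    // Turns "2016-06-27" into "DATEFROMPARTS(2016, 6, 27)" so that the
    // generated SQL carries no dialect-specific date string literal.
    static String dateFromParts(String isoDate) {
        String[] parts = isoDate.split("-");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected yyyy-MM-dd, got: " + isoDate);
        }
        int year = Integer.parseInt(parts[0]);
        int month = Integer.parseInt(parts[1]);
        int day = Integer.parseInt(parts[2]);
        // Locale.ROOT keeps the digits ASCII regardless of the JVM default locale.
        return String.format(Locale.ROOT, "DATEFROMPARTS(%d, %d, %d)", year, month, day);
    }
}
```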

* Added table qualifier for feas_study tables.

* Added cast() calls to POWER function so that the result will be cast to a 64-bit integer. This allows us to support up to 62 distinct inclusion rules.
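
To illustrate why the integer width matters, here is a small Java sketch of an inclusion-rule bitmask in which each rule contributes 2^n, as POWER(2, n) would in SQL. The class and method names are hypothetical; the 62-rule ceiling is taken from the commit message above, not derived here.

```java
// Hypothetical helper, for illustration only.
final class InclusionRuleMask {

    // Ceiling stated in the commit message.
    static final int MAX_RULES = 62;

    // Builds a mask with one bit set per satisfied rule (rule indexes are zero-based).
    static long maskFor(int... satisfiedRules) {
        long mask = 0L;
        for (int rule : satisfiedRules) {
            if (rule < 0 || rule >= MAX_RULES) {
                throw new IllegalArgumentException("rule index out of range: " + rule);
            }
            // 1L forces 64-bit arithmetic; a plain int 1 would overflow at bit 31.
            mask |= 1L << rule;
        }
        return mask;
    }

    // Mask value for a person who satisfies every one of ruleCount rules.
    static long allRules(int ruleCount) {
        if (ruleCount < 0 || ruleCount > MAX_RULES) {
            throw new IllegalArgumentException("rule count out of range: " + ruleCount);
        }
        return (1L << ruleCount) - 1;
    }
}
```

In 32-bit arithmetic the same computation breaks down at the 32nd rule, which is the problem the cast to a 64-bit integer avoids on the SQL side.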

* Set visit type filters to use visit_type_concept_id.

* Revert change to visit_type_concept_id.  Visit_concept_id is the IP/OP concepts, visit_type_concept_id indicates how the visit was derived.

* #68

* #70

* remove MSSQL JDBC dependency from webapi-oracle profile.

* Fixed min value for Oracle Sequence.

* Remove invalid concept relationships when looking up mapped concepts.

Fixes #85

* Update README.md

Added links to installation guide.

* ConceptIDs passed into queries should not be surrounded with quotes.

* Fixed #39

Thanks for pointer, @cataphract

* Force correlated criteria to be within the same observation period as index.

* Added calls to splitSql for oracle support.

Fixes 'invalid character' errors in Oracle with trailing ; in commands.

* LAERTES update changes some evidence table schemas and queries

* updating flyway to support the changes to LAERTES made for this update

* updated WebAPI to work with the updated LAERTES evidence data

* modified some testing files

* minor fixes to evidence service after testing in sql server environment

* Released branch update

* Readme whitespace cleanup

* Concept Set Existence; Penelope Tables

* RSB Integration

* RSB Integration

* OHDSI-70 => fix ORA-14155 issues. (#96)

* heracles script fixes

* Concept Set Source Code Evaluation

* Use 'datasource.ohdsi.schema' as default schema.

* fix for cross-database support, improvements to the person api

* Adjust cohort expression query builder to build series of INSERT INTO (changed from CTAS).

Fixes #99

* Add coalesce to death cause_concept_id.

Fixes #101

* adding refresh route to the source service to allow for a reload of sources from the database without a webapi restart

* Add PostgreSQL migration to create shiro tables.

* Concept Set Export

* Add permission checks on API methods call.

* Update migration scripts for Shiro.

* Concept Set Export

* Added Route to export conceptsets for cohort definition.
Refactored writeToCSVandZip into separate utility function.

* Fixes OHDSI/Atlas#104

Restructured codeset query to fetch all included concepts and left join to all excluded concepts looking for exclusion = null.
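
The "left join and keep rows where the exclusion is null" shape is an anti-join. The in-memory Java sketch below mirrors that logic on plain collections to show the intended result; it is not the SQL from the commit, the names are hypothetical, and de-duplicating the output is an assumption.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class CodesetResolver {

    // Equivalent in spirit to:
    //   select i.concept_id from included i
    //   left join excluded e on e.concept_id = i.concept_id
    //   where e.concept_id is null
    static List<Long> resolve(Collection<Long> included, Collection<Long> excluded) {
        Set<Long> excludedSet = new HashSet<>(excluded);
        // LinkedHashSet keeps first-seen order and drops duplicates (assumed behavior).
        Set<Long> kept = new LinkedHashSet<>();
        for (Long conceptId : included) {
            if (!excludedSet.contains(conceptId)) {
                kept.add(conceptId);
            }
        }
        return new ArrayList<>(kept);
    }
}
```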

* added gender to person api

* Fix for issue #106

Add @ohdsi_database_schema to qualify tables properly

fixes #106

* added list of cohorts a person belongs to to the person api

* cohortbreakdown api for navigation to profiles

* CIRCE Inclusion Rule Support

Cohort generation with inclusion rules
Added new flyway migration for cohort_feasibility tables.
Added new route to retrieve cohort inclusion rule impact.
New route: cohortdefinition/{id}/report/{sourcekey}

* getCohortBreakdownPeople working

* bugfixes. using 'set rowcount' instead of row_number(), which is not
translatable, but a lot faster

* went back to row_number for translatability. kludge makes it faster most
of the time

* Add bearer token auth based on JSON Web Token.

* throwing exception when can't find person. not great.

* got rid of String.join(), it was causing problems for pre Java 8
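
String.join() was introduced in Java 8, so code that must run on older runtimes needs a hand-written equivalent. A minimal sketch of such a replacement follows; the class name is hypothetical and this is not necessarily how the commit solved it.

```java
// Hypothetical helper, for illustration only; uses nothing newer than Java 5.
final class LegacyJoin {

    // Concatenates the string form of each item, separated by the delimiter.
    static String join(String delimiter, Iterable<?> items) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Object item : items) {
            if (!first) {
                sb.append(delimiter);
            }
            sb.append(item);
            first = false;
        }
        return sb.toString();
    }
}
```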

* Closes #113. Fix large concept set export

* fixed errors that popped up after translating to PostgreSQL

* fixed person api to work on postgres. (did this yesterday but it didn't
work. I now have a testing platform for postgres)

* Exclude stats generation SQL when definition SQL is generated via /sql route.
Added @generateStats flag to generateCohort.sql.

* Add method to register windows user.

* Fixes OHDSI/Atlas#173
Drop #cohort_candidate table after cohort generation.

.

* Fix NPE in FeasibilityService.

generate_stats and results_database_schema were not being passed as job parameters, causing an NPE.

* Use sequences as generator for ID columns in Shiro tables. Add default role within migration.

* Fixes OHDSI/Atlas#182

Concept needs to include all properties when serializing JSON.

* Inclusion Rule Report must return results in inclusion rule sequence order.

* Add abstraction to be able to turn SHIRO security off.

* Add NTLM Authentication.

* Add logout method to invalidate access token.

* Add authorizing filter.

* Add expiration time for tokens.

* Fix invalidation of empty token.

* Sort source service /sources result by SourceKey.

* Fixes OHDSI/Atlas#198

* Renamed classes involved in visitor pattern to dispatch pattern for getCriteriaSql().

* Implemented cohort definition enhancements:
1: Added new limit filter: "Qualified Events" to limit the events per person after additional criteria are applied to primary events.
2: Added 'End Date Strategies':  A strategy is used to determine how an end date of a cohort should be applied.
- 2 implementations of end date strategies implemented: DateOffset and CustomEra.
3: Default event limit type to 'First'

* Comparative Cohort Analysis Implementation
New webservices to support comparative cohort analysis through psmodel.
- Added models for attrition, balance, and distribution data.

Removed flyway-sources migration folders from branch.

* Fixing thread safety issue in resolveConceptSetExpression.

* Adding device, procedure and specimen domains to the patient profile data returned.  This makes progress on #170.

* Added migrations for CCA for postgresql and oracle.

* Bug fix - cohort person count should read from results schema

* fix result schema issues

* Fixes #121

Do not apply QualifiedLimitFilter when additional criteria is null.

* Add API methods to manage roles and permissions. Add permission checks.

* Distinguish create and update methods in 'ConceptSetService'.

* Remove obsolete permission checks.

* Hide user service when 'DisabledSecurity' is used.

* Fixes #122.

Truncate/drop #cohort_ends temp table.

* Fixes. Simplify 'UserService'.

* Fix IDs generation.

* Initialize security data within flyway migrations.

* Add API methods to manage roles and permissions.

* added missing table qualifier

* Add/remove appropriate permissions on creating/deleting cohort definition.

* Allow vocabulary search.

* Update rollback scripts.

* Sort output of user service.

* CORS support. Fixes.

* Fix filter chain definitions.

* Put user's permissions into bearer token.

* Discard changes in 'ConceptSetService'.

* Allow read concept sets to all users.

* Change delimiter for permissions in token.

* Demographic Criteria Implementation.

* Removed BETWEEN usage in criteria

A BETWEEN X and Y becomes A >= X and A <= Y.  BETWEEN is not consistent between all db platforms.
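
A small sketch of that rewrite as a string-building helper. The class and method names are hypothetical; the operands are assumed to be SQL fragments that have already been rendered safely.

```java
// Hypothetical helper, for illustration only.
final class RangeCriteria {

    // Emits the explicit, inclusive form of "column BETWEEN low AND high".
    static String inclusiveRange(String column, String low, String high) {
        return "(" + column + " >= " + low + " and " + column + " <= " + high + ")";
    }
}
```

The parentheses keep the two comparisons together when the fragment is combined with other criteria through `or`.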

* Incidence Rate Analysis Implementation

Implementation of Incidence Rate web services, tasklets and repositories and IR Analysis Report Generation.

Added new dependency: apache commons collections v4.1.

* Remove BETWEEN expression from numeric and date range criteria input types.

* Add null check for correlated criteria end date.

Existing cohorts will have this null, so a null check is required.

* enhancements for person profile service and cohort comparison service

* sql syntax error

* Add permissions into cohort roles.
Put new permissions for cohort created with 'copy' method.

* Fix to allow more than 1 person per cohort.

* Flyway deployment scripts for postgreSQL and Oracle.

* adding outcome model retrieval and updates to comparative cohort analysis

* Negative controls implementation
	modified:   src/test/java/org/ohdsi/webapi/test/feasibility/StudyInfoTest.java

* ConceptSet optimization and comparison utilities

* ConceptSet utilities

* Moving GenerationStatus enum

* Delete concept sets per OHDSI/Atlas#87

* Fixing ir calc flyway script for postgreSql

* sql cleanup

* Flyway fixes for sql server

* Allow everything that is not forbidden.

* Add/remove appropriate permissions on creating/deleting a concept set.

* Add/remove permissions on creating/deleting a role. Only author of a role can change it.