Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(GSoC) Generic Assay Categorical/Binary data implementation Backend #10303

Merged
merged 41 commits into from
Aug 18, 2023

Conversation

Djokovic0311
Copy link
Collaborator

Changes proposed

  • Implement new Services and Controller to provide generic assay categorical and binary data enrichments for group comparision.
  • Implement Fisher's exact test for binary data and the chi-square test for categorical data.
  • Add tests for Services.

Display

Binary data visualisation:
Binary
Categorical data visualisation:
Categorical

Copy link
Member

@dippindots dippindots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Djokovic0311, your code looks good to me. Good job!! I think most of my comments are related to the refactoring and some code styles. We can talk about this tomorrow. I still need to figure out why the tests are not passing.

@RestController
@Validated
@Api(tags = "Generic Assay Binary Data", description = " ")
public class GenericAssayBinaryDataController {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's merge this controller and the other controller together and rename it GenericAssayEnrichmentController



@PreAuthorize("hasPermission(#involvedCancerStudies, 'Collection<CancerStudyId>', T(org.cbioportal.utils.security.AccessLevel).READ)")
@RequestMapping(value = "/generic-assay-enrichments/binary/fetch",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to use similar naming structure with other endpoints

Suggested change
@RequestMapping(value = "/generic-assay-enrichments/binary/fetch",
@RequestMapping(value = "/generic-assay-binary-enrichments/fetch",

Comment on lines 104 to 105
public static final String GENERIC_ASSAY_CATEGORICAL_ENRICHMENT_FETCH_PATH = "/generic-assay-enrichments/categorical/fetch";
public static final String GENERIC_ASSAY_BINARY_ENRICHMENT_FETCH_PATH = "/generic-assay-enrichments/binary/fetch";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two also need to be changed accordingly.

private void validateMolecularProfile(MolecularProfile molecularProfile,
List<MolecularProfile.MolecularAlterationType> validMolecularAlterationTypes) throws MolecularProfileNotFoundException {
if (!validMolecularAlterationTypes.contains(molecularProfile.getMolecularAlterationType())) {
throw new MolecularProfileNotFoundException(molecularProfile.getStableId());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also check if a molecularProfile has the correct datatype (e.g. BINARY for binary method and CATEGORICAL for categorical method)

});

// Extract pValues and calculate qValues.
BigDecimal[] pValues = genericAssayBinaryEnrichments.stream().map(a -> a.getpValue()).toArray(BigDecimal[]::new);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to refer its actual type here

Suggested change
BigDecimal[] pValues = genericAssayBinaryEnrichments.stream().map(a -> a.getpValue()).toArray(BigDecimal[]::new);
BigDecimal[] pValues = genericAssayBinaryEnrichments.stream().map(GenericAssayBinaryEnrichment ::getpValue).toArray(BigDecimal[]::new);

import java.util.stream.Collectors;

@Service
public class GenericAssayCategoricalDataServiceImpl implements GenericAssayCategoricalDataService {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see shared comments in GenericAssayCategoricalDataServiceImpl.java


@Spy
@InjectMocks
private ExpressionEnrichmentUtil expressionEnrichmentUtil;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is not being used anywhere, I know this is probably coming from our example test, but since we found it, let's remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I will have a look at your comments. Thanks for your efforts! @dippindots

@dippindots
Copy link
Member

@Djokovic0311 There is one update for fisher tests we just merged (this is why we see the conflicts in the service/src/main/java/org/cbioportal/service/util/FisherExactTestCalculator.java). The change was to use a two-sided fisher test instead of a one-sided one. We should also update the fisher method we used. See #10288 for more details.

@Djokovic0311 Djokovic0311 changed the title Generic Assay Categorical/Binary data implementation (GSoC) Generic Assay Categorical/Binary data implementation Aug 8, 2023
Copy link
Contributor

@JREastonMarks JREastonMarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks good but I would like to see inline comments describing some of the logic. In parts of cBioPortal we have some complicated data manipulations that we perform and without good documentation it is hard for the next person to work on that code.

if (!validMolecularAlterationTypes.contains(molecularProfile.getMolecularAlterationType())) {
throw new MolecularProfileNotFoundException(molecularProfile.getStableId());
}
if(isBinary) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the nested if statements? Is the second one needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two types of generic assay data calling this validation, so the second one is used to verify all the data fetched by getGenericAssayBinaryEnrichments are Binary type, not Categorical, Limited Value or null. This will ensure frontend display the right things.

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No * imports please

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done. Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the * made it back in

@Djokovic0311
Copy link
Collaborator Author

@JREastonMarks Thanks for all your comments reminding me about the inline comments issue. I previously planned to put the instructions in my final report documentation which is ongoing now. At the same time, I will put them in my codes in next commit.

@Djokovic0311 Djokovic0311 changed the title (GSoC) Generic Assay Categorical/Binary data implementation (GSoC) Generic Assay Categorical/Binary data implementation Backend Aug 15, 2023
Copy link
Contributor

@JREastonMarks JREastonMarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Djokovic0311 Sorry, found a few more * imports.

@@ -0,0 +1,261 @@
package org.cbioportal.service.impl;

import org.cbioportal.model.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the * import. Sorry, I caught another one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I noticed the ExpressionEnrichmentTest has the * import, so I just left it here aligned with them.Now I find other test files avoid import *. I have pushed modified version. Shall I modify ExpressionEnrichmentTest as well?

import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.validation.annotation.Validated;
import org.springframework.web.bind.annotation.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the * import

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done.

@sonarcloud
Copy link

sonarcloud bot commented Aug 17, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 17 Code Smells

0.0% 0.0% Coverage
1.6% 1.6% Duplication

warning The version of Java (11.0.20) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.
Read more here

@dippindots
Copy link
Member

@Djokovic0311 Just confirmed with @JREastonMarks that we are good to go, thanks @JREastonMarks for helping us review the code. The local e2e tests are not related, I will squash your commits into one and merge it. Good job @Djokovic0311 !!!

@dippindots dippindots merged commit 901406f into cBioPortal:master Aug 18, 2023
13 of 14 checks passed
JREastonMarks added a commit that referenced this pull request Aug 22, 2023
* fix #9461 - mutations api endpoint
- modified java files to remove unused properties and changes to tests
- db changes, mutationmapper.xml
- Update seed_mini.sql

roll back changes for refseqMrnaId

roll back changes for refseqMrnaId

Update ExtendedMutation.java

Added required property setOncotatorRefseqMrnaId and getOncotatorRefseqMrnaId

* Updated more files for removed fields

Updated more files for removed fields

Remove unused oncotator fields

Remove unused oncotator fields
ONCOTATOR_DBSNP_RS
ONCOTATOR_UNIPROT_ENTRY_NAME

remove fields from sql file

ONCOTATOR_DBSNP_RS and ONCOTATOR_UNIPROT_ENTRY_NAME removal from sql file

* db changes and renaming the column prefixes from oncotator to mutation

db changes in sql for renaming columns
renamed the column prefixes from oncotator to mutation on multiple classes and files

* Update SQL statements

Update testSql.sql

updating testSql for the name change of columns from ONCOTATOR to MUTATION

Update DaoMutation.java

The data files still use the old column prefix of ONCOTATOR instead of MUTATION

Update DaoMutation.java

Update migration.sql

Adding quotes around column names

* db changes and renaming the column prefixes from oncotator to mutation

db changes in sql for renaming columns
renamed the column prefixes from oncotator to mutation on multiple classes and files

* Update migration.sql

Update migration.sql

fix syntax errors in mysql change column

Update migration.sql

fix syntax errors in mysql change column

setting db version

setting db version

* Removed Link columns

Removed Link columns and modified the documentation

* Update news.md

* update news.md

* update cosmic importer for new cosmic datafiles

* Update DB version and renaming mutation_event fields

* Update Mutation model

* Update SQL to fix incompatibility with sqlmode OnlyFullGroupBy and add support for multiple database impls to support mysql and h2

* Allow binning and filtering of custom (numerical and categorical) data

* Several small fixes and cleanup after feedback @dippindots

* Remove excess whitespace

* Fix combined Entrez Gene Id and Stuct Var Query in StructuralVariantMapper.xml

* update bilkent members

* Frontend v5.3.0

* Update ClinicalEventMyBatisRepositoryTest.java

We need to update this test because it's not java 8 compatible.

* Address java8 compatible issues in more tests

* Add importer and validator section to release-drafter.yml

* Fix release-drafter.yml error

* Update news.md

* Frontend v5.3.1

* Update News.md

* Use custom driver annotations in sv counts

* Test custom driver annotations filter in sv count query

* Differentiate between test svs by making annotations unique

* Fix sv whereSite1IsNot2 clause, add es0 sv custom driver data

* Remove unused DriverManagerDataSource fields from db tests

* Add comment to mybatis sv whereSite1IsNot2 clause

* Fix validation result_report.html: include new sv

* Frontend v5.3.2

* Update News.md

* fixed swagger page validation errors

* Frontend v5.3.3

* Add MSK database migration documentation (#10115)

Co-authored-by: Ino de Bruijn <ino@ino.bio>

* Do not call next study view filter when no samples left (#10124)

* Add studyId to CancerStudyTag entity

* Frontend v5.3.4

* 🔧 Add Configuration Properties for study view summary limits

* add sitemaps for retype

adding url should fix this

* Frontend v5.3.5

* Add tutorials from others

* Add Farhan Haq tutorial video

* 🔧 Add StudyDownloadUrl to portal.properties

* Frontend v5.3.6

* Update News-Genie.md

* Fix study tag search when no studies present

* Add dsnparse Python requirement

Needed for future PR concerning using db.connection_string to connect to the database

* add jeremy to about page

* Documentation markdown fixes (#10186)

* Fix broken hyperlink to retype
* Move development files under development directory
* Fix markdown links for table of contents

* Frontend v5.3.7

* Update gene set version and update tests

* Fix integration test

* Frontend v5.3.8

* 📚 Update Priority Documentation

* ✨ Add ability to set priority to -1 for clinical attr

* add statistical tests in FAQ

Fixed #10198

* extend File-Formats.md with mutational signature datatype (#10172)

* Frontend v5.3.9

* change docker compose cmds

* fix contact email in dev section

* fix release notes docs link

* fix release procedure info

* Frontend v5.3.10

* Add note about official releases

* Add AACR GENIE BPC tutorial videos (#10215)

* Adding new features news (#10149)

* Adding new features news from 2022/05 to current

Co-authored-by: Ino de Bruijn <ino@ino.bio>

* Update web-API-and-Clients.md (#10209)

Add description of cbioportalR in web API section

* Frontend v5.3.11

* Paginate clinical table response using Page and Pagable spring interfaces

* Use PaginatedClinicalData

* Add query to count items in clinical table

* Replace spring page with page response headers

* Fix page size when sorting clinical table

* Sort clinical table by sample and patient ID

* Fix distinct in patient attr sorting

* Fix difference h2 and mysql in sorting nulls last

* Rename *SampleClinicalTable* methods

* update members about us page

add calla & bryan, move several people to alumni

* Revert controler rename to keep backward compatible

* Impl. Struct Var count endpoint for Study View

* Remove unused imports

PR feedback Gaofei

* Correct variable name

PR feedback Gaofei

* Change HashMap to Set

PR feedback Gaofei

* Return counts when gene 1 or gene2 is null

* Change missing site1 or site2 gene for SV to INFO log level

Was WARN before.

* Update result report

* Create property to show reference genome in study list

* Frontend v5.3.12

* fix contributor last name

* Frontend v5.3.13

* Impl. Struct Var filter queries for /filtered-samples/fetch endpoint

* Add unit tests for resolveEntrezGeneIds

* Changes after PR reviews

- Rename StructVar* to StructuralVariant*
- Revert JsonInclude Always to NON_NULL
-

* Add Java 11 as requirement

* modify documentation for skin.hide_download_controls

* Improve treatment data api performance

* Improve study view treatment api performance

* Update download control options (#10264)

* Add more tests and comments

* fix 10221 - 403 issue

fix 10221 - 403 issue; swagger annotations cause issue in authentication object. If user authorization is enabled, authentication object is obtained from SecurityContextHolder

* Update Pyyaml package

See yaml/pyyaml#724 for details of this update

* Frontend v5.3.14

* Changes related to download group

* Allow ProfiledCasesCounter to add counts to empty alteration count list (#10255)

* Update about us page

* Frontend v5.3.15

* Cache treatment endpoints (#10282)

* update one-sided fisher test to two-sided

* Allow redis service unavailable when Redis is enabled

* Frontend v5.3.16

* Update two-sided fisher tests news (#10312)

* Add two-sided fisher tests news

---------

Co-authored-by: Ino de Bruijn <ino@ino.bio>

* Genomic data counts study-view endpoint (#10300)

* implement getting CNA types for a gene specific

* add tests for genomic-data-counts endpoint

* change CopyNumberDataCounterFilter to StudyViewFilter and GenomicDataCountFilter

* change to molecularDataService to fetch data

* add filter logic for DISCRETE molecular profiles

* Refactor

* count NA and clean up codes

* Updates

* Update StudyViewServiceImplTest.java

---------

Co-authored-by: Qi-Xuan Lu <qlu@carisls.com>
Co-authored-by: Karthik <kalletlakarthik@gmail.com>

* Frontend v5.3.17

* Add defaults for genomic evolution tab in patient view (#10309)

* Add defaults for mutation heatmap and line chart checkboxes

* Update text

---------

Co-authored-by: Gaofei Zhao <15748980+dippindots@users.noreply.github.com>

* Add data access token user role filter feature (#10315)

* Add data access token user role filter feature
* Update and more tests
* Update failing test

---------

Co-authored-by: Jeremy R. Easton-Marks <j.r.eastonmarks@gmail.com>

* Update About-Us.md (#10196)

Update About-Us.md

* Add label check test for pull request

* Frontend v5.3.18

* add info about adding labels

* add property show_web_tours (#10333)

* (GSoC) Generic Assay Categorical/Binary data implementation Backend (#10303)

* update backend

* finish tests

* modify controller file structure

* modify import

* update backend

* finish tests

* modify controller file structure

* refactor some variables

* modify enrichment compare and validate issue

* refactor the controllers and services

* refactor test

* Update GenericAssayEnrichmentServiceImpl.java

* fixed simplification issues

* add inline comments

* modify import

* solve core test issue

* test check

* modify expressionenrichment

* modify imports

* update backend

* finish tests

* modify controller file structure

* modify import

* update backend

* finish tests

* modify controller file structure

* refactor some variables

* modify enrichment compare and validate issue

* refactor the controllers and services

* refactor test

* Update GenericAssayEnrichmentServiceImpl.java

* fixed simplification issues

* add inline comments

* modify import

* solve core test issue

* test check

* modify expressionenrichment

* modify imports

* update imports

* Update GenericAssayEnrichment.java

---------

Co-authored-by: Prasanna Kumar Jagannathan <37613906+jagnathan@users.noreply.github.com>
Co-authored-by: sbabyanusha <satravab@mskcc.org>
Co-authored-by: matthijspon <matthijs@thehyve.nl>
Co-authored-by: Gaofei <15748980+dippindots@users.noreply.github.com>
Co-authored-by: haynescd <haynescd@chop.edu>
Co-authored-by: Charles Haynes <33608920+haynescd@users.noreply.github.com>
Co-authored-by: Pim van Nierop <pim@thehyve.nl>
Co-authored-by: Bas Leenknegt <basleenknegt@thehyve.nl>
Co-authored-by: Ino de Bruijn <ino@ino.bio>
Co-authored-by: sbabyanusha <61431648+sbabyanusha@users.noreply.github.com>
Co-authored-by: John Konecny <jfkonecn@gmail.com>
Co-authored-by: alisman <lisman.aaron@gmail.com>
Co-authored-by: ritikakundra <kundrar@mskcc.org>
Co-authored-by: Tali Mazor <tmazor@users.noreply.github.com>
Co-authored-by: Justin Jao <47490085+justinjao@users.noreply.github.com>
Co-authored-by: JJ Gao <jianjiong.gao@gmail.com>
Co-authored-by: Matthijs Pon <63122826+MatthijsPon@users.noreply.github.com>
Co-authored-by: karissawhiting <karissa.whiting@gmail.com>
Co-authored-by: oplantalech <oleguer@thehyve.nl>
Co-authored-by: Onur Sumer <s.onur.sumer@gmail.com>
Co-authored-by: Qi-Xuan Lu <qlu@carisls.com>
Co-authored-by: Karthik <kalletlakarthik@gmail.com>
Co-authored-by: Bryan Lai <laib1@mskcc.org>
Co-authored-by: qi-xuan.lu <h164654156465@outlook.com>
Co-authored-by: Beking0912 <37431792+Beking0912@users.noreply.github.com>
Co-authored-by: Jiahang Li <55948986+Djokovic0311@users.noreply.github.com>
jagnathan pushed a commit to pughlab/cbioportal that referenced this pull request Nov 8, 2023
…BioPortal#10303)

* update backend

* finish tests

* modify controller file structure

* modify import

* update backend

* finish tests

* modify controller file structure

* refactor some variables

* modify enrichment compare and validate issue

* refactor the controllers and services

* refactor test

* Update GenericAssayEnrichmentServiceImpl.java

* fixed simplification issues

* add inline comments

* modify import

* solve core test issue

* test check

* modify expressionenrichment

* modify imports

* update backend

* finish tests

* modify controller file structure

* modify import

* update backend

* finish tests

* modify controller file structure

* refactor some variables

* modify enrichment compare and validate issue

* refactor the controllers and services

* refactor test

* Update GenericAssayEnrichmentServiceImpl.java

* fixed simplification issues

* add inline comments

* modify import

* solve core test issue

* test check

* modify expressionenrichment

* modify imports

* update imports

* Update GenericAssayEnrichment.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants