Feature/compara datachecks #173

Merged: 18 commits, Jan 21, 2020

@CristiGuijarro CristiGuijarro commented Nov 29, 2019

Addition of new critical compara DCs converted from HCs:

  • AlignmentCoordinates
  • CheckComparaStableIDs
  • CheckConservationScore
  • CheckCAFETable
  • CheckConstrainedElementsTable
  • CheckDuplicatedTaxaNames
  • CheckFlatProteinTrees
  • CheckGeneGainLossData
  • CheckFirstLastRelease - (split into CheckReleaseNulls & CheckReleaseConsistency)
  • CheckGenomicAlignGenomeDBs
  • HighConfidence - (from EGHighConfidence)
  • CheckEmptyLeavesTrees
  • CheckSpeciesSetTable
  • CheckSpeciesTreeNodeAttr
  • CheckSynteny
  • MultipleGenomicAlignBlockIds
  • CheckHomology
  • MemberProductionCounts

Addition of new method:

  • is_one_to_many

coveralls commented Nov 29, 2019

Pull Request Test Coverage Report for Build 870

  • 16 of 16 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 98.59%

Totals Coverage Status
Change from base Build 868: 0.01%
Covered Lines: 1818
Relevant Lines: 1844
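
For reference, the overall figure can be reproduced from the two totals above. This is a minimal sketch of the arithmetic, assuming the percentage is simply covered lines divided by relevant lines; the function name `coverage_percent` is illustrative and not part of Coveralls.

```python
def coverage_percent(covered_lines: int, relevant_lines: int) -> float:
    """Return line coverage as a percentage, rounded to two decimals."""
    if relevant_lines <= 0:
        raise ValueError("relevant_lines must be positive")
    return round(100.0 * covered_lines / relevant_lines, 2)


# Totals reported for Build 870 in this PR.
print(coverage_percent(1818, 1844))  # 98.59
```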

💛 - Coveralls

@muffato muffato left a comment

Overall, good job :) Many DCs are ready to go straight in!

You may have noticed that the coverage has dropped by almost 1%. This is because you don't have unit-tests for the new methods you have added. All the utility functions should be thoroughly tested. Can you please add some in t/TestDataCheck.t?

lib/Bio/EnsEMBL/DataCheck/Test/DataCheck.pm Outdated
lib/Bio/EnsEMBL/DataCheck/DbCheck.pm Outdated
lib/Bio/EnsEMBL/DataCheck/Checks/EGHighConfidence.pm Outdated
lib/Bio/EnsEMBL/DataCheck/Checks/AlignmentCoordinates.pm Outdated
lib/Bio/EnsEMBL/DataCheck/Checks/CheckEmptyLeavesTrees.pm Outdated

muffato commented Dec 6, 2019

Forgot to mention it but 🥇 for documenting the new methods

CristiGuijarro commented Dec 6, 2019

  • Add is_one_to_many to t/TestDataCheck.t

  • Going to include at least one more DC in this PR (MemberProductionCounts)

@muffato muffato left a comment

Something I just spotted because I looked at the changes from the terminal: many lines end with trailing whitespace. Although we don't enforce it in Perl (we will in Python, though), it is considered better to clean that up. I don't know what @james-monkeyshines' policy is. In this instance, I would suggest leaving the PR as it is, but fixing it on the lines you need to change to address the comments. On the next PRs, though, make sure the trailing whitespace is removed.

@CristiGuijarro

I am still having arguments with Atom. I hate the trailing white space, but I can't seem to get Atom to strip it without it also removing trailing white space where it shouldn't.
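
An editor-independent way to handle this is a small script that reports the offending lines and cleans only the ones you choose, which matches the suggestion of fixing just the lines being changed. The following is a minimal sketch, not part of this PR or of ensembl-datacheck; the function names and the `only_lines` parameter are hypothetical.

```python
from typing import Optional, Set


def find_trailing_whitespace(text: str) -> list:
    """Return 1-based numbers of lines that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def strip_trailing_whitespace(text: str, only_lines: Optional[Set[int]] = None) -> str:
    """Strip trailing spaces/tabs, optionally only on the given 1-based lines.

    Line endings are preserved exactly as they appear in the input.
    """
    cleaned = []
    for number, line in enumerate(text.splitlines(keepends=True), start=1):
        if only_lines is None or number in only_lines:
            body = line.rstrip("\r\n")      # content without the line ending
            ending = line[len(body):]       # the original line ending, if any
            line = body.rstrip(" \t") + ending
        cleaned.append(line)
    return "".join(cleaned)


sample = "use strict;  \nmy $x = 1;\nreturn $x;\t\n"
print(find_trailing_whitespace(sample))                 # [1, 3]
print(repr(strip_trailing_whitespace(sample, {3})))     # only line 3 is cleaned
```

Passing `only_lines` keeps the diff limited to lines already touched by the review, while calling `strip_trailing_whitespace(text)` with no second argument cleans the whole file.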

@muffato muffato left a comment

(haven't finished checking MemberProductionCounts yet)

t/TestDataCheck.t Outdated
lib/Bio/EnsEMBL/DataCheck/Checks/CheckFlatProteinTrees.pm Outdated
lib/Bio/EnsEMBL/DataCheck/Checks/MemberProductionCounts.pm Outdated (2 threads)
…Attr, CheckSynteny and MultipleGenomicAlignBlockIds

Following review from @muffato

Reverse accidental commit
lib/Bio/EnsEMBL/DataCheck/Checks/CheckFlatProteinTrees.pm Outdated (5 threads)
lib/Bio/EnsEMBL/DataCheck/Checks/MemberProductionCounts.pm Outdated (5 threads)

@muffato muffato left a comment

(I haven't finished MemberProductionCounts and CheckFlatProteinTrees)

@muffato muffato left a comment

Almost done :) 😌

re-update index.json

Compara for travis.

Addressing more review comments from @muffato

Addressing review comments & additional descriptive fixes

Update lib/Bio/EnsEMBL/DataCheck/Checks/MemberProductionCounts.pm

Co-Authored-By: Matthieu Muffato <muffato@ebi.ac.uk>

Apply suggestions from code review

Co-Authored-By: Matthieu Muffato <muffato@ebi.ac.uk>

Changes following review

Addressed more comments in code review

Review address

Review comments addressing

:+1:
@CristiGuijarro

🙏

@muffato muffato left a comment

🎉

Co-Authored-By: Carla Cummins <carlac@ebi.ac.uk>

@james-monkeyshines james-monkeyshines left a comment

Comments on a few minor issues; the only one that needs to be fixed before merging is the name of the module in HighConfidence.

@james-monkeyshines james-monkeyshines merged commit 2523ea1 into Ensembl:master Jan 21, 2020
@CristiGuijarro CristiGuijarro deleted the feature/compara_datachecks branch January 23, 2020 08:51
5 participants