Add GC content std.dev. limit and coverage std. dev. limit Binning metrics #120

Merged 15 commits on Mar 16, 2021
Changes from 10 commits
153 changes: 139 additions & 14 deletions Makefile
@@ -1,23 +1,148 @@
hello:
@echo "Please inspect Makefile for list of commands"
.PHONY: clean black create_environment install image docs clean unit_test unit_test_data unit_test_data_download unit_test_data_build unit_test_wip unit_test_entrypoints

#################################################################################
# GLOBALS #
#################################################################################

PROJECT_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
PROJECT_NAME = autometa
PYTHON_INTERPRETER = python3
# This was retrieved from https://drive.google.com/file/d/1bSlPldaq3C6Cf9Y5Rm7iwtUDcjxAaeEk/view?usp=sharing
TEST_DATA_FILEID = 1bSlPldaq3C6Cf9Y5Rm7iwtUDcjxAaeEk

ifeq (,$(shell which conda))
HAS_CONDA=False
else
HAS_CONDA=True
endif

#################################################################################
# COMMANDS #
#################################################################################

## Delete all compiled Python files
clean:
rm -rf htmlcov && make clean -C docs
find . -type f -name "*.py[co]" -delete
find . -type d -name "__pycache__" -delete

## Apply black formatting
black:
black --exclude autometa/validation autometa

## Set up python interpreter environment
create_environment:
ifeq (True,$(HAS_CONDA))
@echo ">>> Detected conda, creating conda environment."
ifeq (3,$(findstring 3,$(PYTHON_INTERPRETER)))
conda create --name $(PROJECT_NAME) python=3
else
conda create --name $(PROJECT_NAME) python=2.7
endif
@echo ">>> New conda env created. Activate with:\nsource activate $(PROJECT_NAME)"
else
$(PYTHON_INTERPRETER) -m pip install -q virtualenv virtualenvwrapper
@echo ">>> Installing virtualenvwrapper if not already installed.\nMake sure the following lines are in shell startup file\n\
export WORKON_HOME=$$HOME/.virtualenvs\nexport PROJECT_HOME=$$HOME/Devel\nsource /usr/local/bin/virtualenvwrapper.sh\n"
@bash -c "source `which virtualenvwrapper.sh`;mkvirtualenv $(PROJECT_NAME) --python=$(PYTHON_INTERPRETER)"
@echo ">>> New virtualenv created. Activate with:\nworkon $(PROJECT_NAME)"
endif

#################################################################################
# PROJECT RULES #
#################################################################################

## Install autometa from source
install:
python setup.py install
$(PYTHON_INTERPRETER) setup.py install

## Build docker image from Dockerfile (auto-tagged as jason-c-kwan/autometa:<current-branch>)
image: Dockerfile
docker build . -t jason-c-kwan/autometa:`git branch --show-current`

## Build documentation for autometa.readthedocs.io
docs:
make clean html -C docs && open docs/build/html/index.html
make clean html -C docs
@echo "docs built. Open docs/build/html/index.html to view"

clean:
@echo "Removing everything under 'htmlcov'..."
@rm -rf htmlcov && make clean -C docs
## Download test_data.json for unit testing
unit_test_data_download:
gdown --id $(TEST_DATA_FILEID) -O tests/data/test_data.json

## Build test_data.json file for unit testing (requires all files from https://drive.google.com/open?id=189C6do0Xw-X813gspsafR9r8m-YfbhTS be downloaded into tests/data/)
unit_test_data_build: tests/data/records.fna
Comment on lines +79 to +80
Collaborator

I ran make image, which worked on my Mac. Inside the image, after running apt-get install build-essential, make unit_test_data_download worked, but make unit_test_data_build gave me this:

make: *** No rule to make target 'tests/data/records.fna', needed by 'unit_test_data_build'.  Stop.

Collaborator Author

@evanroyrees evanroyrees Mar 16, 2021

Did you try make unit_test?

make unit_test_data_build is only the first command to use if you are trying to rebuild the test_data.json object. It aggregates all of the files found in the Google Drive folder linked in the docs.


Perhaps for documentation
@Sidduppal


Building Autometa's unit tests

Relevant Makefile Commands

  • unit_test_data_build
  • unit_test_data_download
  • unit_test_wip
  • unit_test_entrypoints
  • unit_test

unit_test_data_build

This command builds the test_data.json file used for unit testing. It runs the script make_test_data.py, which aggregates all of the files in the tests/data folder. (Note: the required files have been placed at https://drive.google.com/open?id=189C6do0Xw-X813gspsafR9r8m-YfbhTS and should be downloaded into tests/data/.) This is the first, or perhaps 0th, step in running the tests: it generates the test_data.json file, which is parsed to retrieve all of the pre-generated variables used for intermediate stages of the pipeline. Pre-generating these variables reduces the test time and computational workload of the test suite.

As an example, running diamond BLAST is a stage in the pipeline, but we should not be testing whether diamond works. We only need to retrieve a pre-computed diamond BLAST table and ensure that our parser and the subsequent steps function correctly. The table is therefore written into test_data.json and retrieved when the parser unit test runs. Because the rest of the suite reads from this file, decisions made at this stage are fundamental and may have unintended side effects across the other tests.
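
To illustrate the idea, here is a minimal sketch of how several intermediate files could be aggregated into a single JSON object and read back inside a test. The function names, the key scheme (file name to raw contents), and the restriction to plain-text files are all assumptions made for illustration; this is not the actual make_test_data.py.

```python
import json
from pathlib import Path


def aggregate_test_data(data_dir, out_path):
    """Collect every plain-text file in data_dir into one JSON file.

    Hypothetical stand-in for make_test_data.py: keys are file names,
    values are the raw file contents.
    """
    data_dir = Path(data_dir)
    out_path = Path(out_path)
    test_data = {
        path.name: path.read_text()
        for path in sorted(data_dir.iterdir())
        # Skip sub-directories and a previously written output file.
        if path.is_file() and path != out_path
    }
    out_path.write_text(json.dumps(test_data, indent=2))
    return test_data


def load_pregenerated(out_path, key):
    """Return one pre-generated object (e.g. a BLAST table) by key."""
    with open(out_path) as fh:
        return json.load(fh)[key]
```

A parser test could then call load_pregenerated(...) to obtain the stored table instead of running diamond itself.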

unit_test_data_download

This command downloads the test_data.json object that was generated by unit_test_data_build and then uploaded to the autometa_test_data Google Drive folder. It is a convenience for anyone who wants to run the test suite without rebuilding the test variables with the previous command. This step is required before running the unit tests, because test_data.json holds many of the variables the tests need.
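
For reference, a rough Python equivalent of the download step is sketched below. It assumes the third-party gdown package is installed and uses its download function with a Google Drive URL; the helper names are hypothetical, and the default output path simply mirrors the one used in the Makefile rule.

```python
def drive_download_url(file_id):
    """Build a direct-download URL for a Google Drive file ID."""
    return f"https://drive.google.com/uc?id={file_id}"


def download_test_data(file_id, output="tests/data/test_data.json"):
    """Download test_data.json with gdown (imported lazily)."""
    import gdown  # third-party; assumed installed (pip install gdown)

    # gdown.download returns the output path on success.
    return gdown.download(drive_download_url(file_id), output, quiet=False)
```

Calling download_test_data with the value of TEST_DATA_FILEID from the Makefile should fetch the same file as make unit_test_data_download.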

unit_test_wip

This command runs only the unit tests marked as work-in-progress (WIP). Tests are marked in pytest with the decorator:

@pytest.mark.wip
def test_some_function(...):
    ...

unit_test_entrypoints

This command runs only the unit tests marked as entrypoints. Tests are marked in pytest with the decorator:

@pytest.mark.entrypoint
def test_some_function_that_is_an_entrypoint(...):
    ...

Entrypoints are the commands we will list as 'console scripts' in setup.py, which makes them available as commands to the end user. Tests of these are important and are sometimes referred to as "happy" tests, because if one of them fails for the end user, they will probably be quite unhappy and likely distrust the functionality of the rest of the codebase.
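
Custom markers such as wip and entrypoint are usually registered so that pytest does not warn about unknown marks. The sketch below shows one way to do that from a conftest.py using the pytest_configure hook; whether Autometa registers its markers this way or in pytest.ini is not stated here, and the marker descriptions are made up for illustration.

```python
# conftest.py (sketch; descriptions are illustrative)
MARKERS = {
    "wip": "work-in-progress tests",
    "entrypoint": "tests for console-script entrypoints",
}


def pytest_configure(config):
    """Register custom markers so pytest does not warn about unknown marks."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

With the markers registered, pytest -m "wip" or pytest -m "entrypoint" selects the corresponding subset of tests.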

unit_test

This command runs all unit tests under the tests directory, including all tests marked as WIP or as entrypoints. However, it skips tests marked with the decorator:

@pytest.mark.skip
def test_some_function(...):
    ...
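
The three test commands differ only in the marker expression passed to pytest. As a small sketch, the helper below (a hypothetical name, not part of the repository) builds the same argument list that the unit_test, unit_test_wip, and unit_test_entrypoints rules pass to pytest in the Makefile.

```python
import sys


def pytest_command(marker=None):
    """Mirror the pytest invocation used by the unit_test* Makefile rules."""
    cmd = [sys.executable, "-m", "pytest"]
    if marker is not None:
        cmd += ["-m", marker]  # e.g. "wip" or "entrypoint"
    cmd += ["--durations=0", "--cov=autometa", "--emoji", "--cov-report", "html"]
    return cmd
```

pytest_command() corresponds to unit_test, while pytest_command("wip") and pytest_command("entrypoint") correspond to the two marker-restricted rules.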

Collaborator

From the above description, it is unclear whether you need to run make unit_test_data_build as well as make unit_test_data_download. I just pulled this PR, ran make image, then in the image successfully ran make unit_test_data_download. After that, make unit_test_data_build gives me:

make: *** No rule to make target 'tests/data/records.fna', needed by 'unit_test_data_build'.  Stop.

make unit_test_wip gives me:

python3 -m pytest -m "wip" --durations=0 --cov=autometa --emoji --cov-report html
ERROR: usage: __main__.py [options] [file_or_dir] [file_or_dir] [...]
__main__.py: error: unrecognized arguments: --variables --cov=autometa --emoji --cov-report html
  inifile: /pytest.ini
  rootdir: /

make: *** [Makefile:85: unit_test_wip] Error 4

Then make unit_test_entrypoints gives me:

python3 -m pytest -m "entrypoint" --durations=0 --cov=autometa --emoji --cov-report html
ERROR: usage: __main__.py [options] [file_or_dir] [file_or_dir] [...]
__main__.py: error: unrecognized arguments: --variables --cov=autometa --emoji --cov-report html
  inifile: /pytest.ini
  rootdir: /

make: *** [Makefile:89: unit_test_entrypoints] Error 4

I also get the same thing with make unit_test. Seems like there is something wrong with __main__.py? Not sure what the make unit_test_data_build error is all about.

Collaborator Author

pytest-variables and pytest-cov need to be installed. I've simplified this a bit more within the Makefile by adding the command test_environment. Now if you run make unit_test (or any rule that has test_environment as a dependency), it will install any missing requirements before running the unit tests.
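
A quick way to check for this situation before invoking pytest is sketched below. The function name is hypothetical, and the import names pytest_cov and pytest_variables are assumed to be the module names that the pytest-cov and pytest-variables packages install.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level modules from module_names that cannot be found."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumed import names for pytest-cov and pytest-variables.
REQUIRED_PLUGINS = ["pytest_cov", "pytest_variables"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_PLUGINS)
    if missing:
        print("Missing pytest plugins:", ", ".join(missing))
```

If either plugin is reported missing, pytest will reject the corresponding command-line options (such as --cov or --variables) as unrecognized arguments.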

Collaborator

OK, I re-pulled, and then deleted my old docker image and re-built it. I ran make test_environment, which seemed to work fine, then make unit_test_data_download, which also worked. But then when I tried make unit_test_data_build, I still get:

make: *** No rule to make target 'tests/data/records.fna', needed by 'unit_test_data_build'.  Stop.

I now get a bit further with make unit_test_wip. I get:

...
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.8.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /, configfile: pytest.ini
plugins: html-3.1.1, metadata-1.11.0, md-0.2.0, emoji-0.2.0, repeat-0.9.1, cov-2.11.1, variables-1.9.0
collected 0 items / 1 error

=================================================================================================================== ERRORS ===================================================================================================================
_______________________________________________________________________________________________________ ERROR collecting test session ________________________________________________________________________________________________________
opt/conda/lib/python3.8/site-packages/_pytest/runner.py:311: in from_call
    result: Optional[TResult] = func()
opt/conda/lib/python3.8/site-packages/_pytest/runner.py:341: in <lambda>
    call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
opt/conda/lib/python3.8/site-packages/_pytest/main.py:710: in collect
    for x in self._collectfile(path):
opt/conda/lib/python3.8/site-packages/_pytest/main.py:546: in _collectfile
    assert (
E   AssertionError: local('/dev/console') is not a file (isdir=False, exists=True, islink=False)

----------- coverage: platform linux, python 3.8.5-final-0 -----------
Coverage HTML written to dir htmlcov

========================================================================================================== short test summary info ===========================================================================================================
ERROR  - AssertionError: local('/dev/console') is not a file (isdir=False, exists=True, islink=False)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================================================== 1 error in 0.82s ==============================================================================================================
make: *** [Makefile:89: unit_test_wip] Error 2

I get a similar thing for make unit_test_entrypoints:

============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.8.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /, configfile: pytest.ini
plugins: html-3.1.1, metadata-1.11.0, md-0.2.0, emoji-0.2.0, repeat-0.9.1, cov-2.11.1, variables-1.9.0
collected 0 items / 1 error

=================================================================================================================== ERRORS ===================================================================================================================
_______________________________________________________________________________________________________ ERROR collecting test session ________________________________________________________________________________________________________
opt/conda/lib/python3.8/site-packages/_pytest/runner.py:311: in from_call
    result: Optional[TResult] = func()
opt/conda/lib/python3.8/site-packages/_pytest/runner.py:341: in <lambda>
    call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
opt/conda/lib/python3.8/site-packages/_pytest/main.py:710: in collect
    for x in self._collectfile(path):
opt/conda/lib/python3.8/site-packages/_pytest/main.py:546: in _collectfile
    assert (
E   AssertionError: local('/dev/console') is not a file (isdir=False, exists=True, islink=False)

----------- coverage: platform linux, python 3.8.5-final-0 -----------
Coverage HTML written to dir htmlcov

========================================================================================================== short test summary info ===========================================================================================================
ERROR  - AssertionError: local('/dev/console') is not a file (isdir=False, exists=True, islink=False)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================================================== 1 error in 0.50s ==============================================================================================================
make: *** [Makefile:93: unit_test_entrypoints] Error 2

This StackOverflow question suggests the last two problems have something to do with setting the working directory correctly. I am really not sure what is going on with make unit_test_data_build.

https://stackoverflow.com/questions/65394782/pytest-not-working-with-django-and-docker-assertionerror-local-dev-console
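
The logs above report rootdir: /, so pytest appears to have collected from the filesystem root and tripped over /dev/console. One possible workaround, sketched here with a hypothetical helper, is to run pytest with the project directory as the working directory and an explicit tests path. This is an illustration of the working-directory idea only, not necessarily the fix that was applied in this PR.

```python
import subprocess
import sys
from pathlib import Path


def run_tests_from(project_dir, marker=None):
    """Run pytest on project_dir/tests with project_dir as the working directory."""
    project_dir = Path(project_dir).resolve()
    tests_dir = project_dir / "tests"
    if not tests_dir.is_dir():
        raise FileNotFoundError(f"no tests directory under {project_dir}")
    # Passing the tests path explicitly keeps collection out of other directories.
    cmd = [sys.executable, "-m", "pytest", str(tests_dir)]
    if marker:
        cmd += ["-m", marker]
    return subprocess.run(cmd, cwd=project_dir).returncode
```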

Collaborator Author

Addressing unit_test_data_build

unit_test_data_build is not required for running unit tests. It is a one-off command for creating a new test_data.json object, which is only necessary if we are changing file formats or adding more objects to the test suite.

The only commands you need to run the test suite are unit_test_data_download and unit_test{_wip,_entrypoints}.

Until you download the entire autometa_test_data/unit_test_data directory, which contains records.fna as well as all of the other files that were used to create test_data.json, you will continue to get the error:

make: *** No rule to make target 'tests/data/records.fna', needed by 'unit_test_data_build'.  Stop.

Make is telling you that the file at the path tests/data/records.fna does not exist. This file is not downloaded by unit_test_data_download. It would need to be downloaded, along with a number of other files in autometa_test_data/unit_test_data, for unit_test_data_build to work.

Luckily, you only need the test_data.json file to run tests, so you can safely ignore unit_test_data_build when running unit tests for autometa.
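
The decision described in this comment can be summarized in a small sketch. The helper name is hypothetical, and checking for records.fna alone is a simplification: the build step needs the other files from autometa_test_data/unit_test_data as well.

```python
from pathlib import Path


def next_make_target(data_dir="tests/data"):
    """Suggest which Makefile rule to run next, based on files in data_dir."""
    data_dir = Path(data_dir)
    if (data_dir / "test_data.json").is_file():
        return "unit_test"  # test data present: the tests can run
    if (data_dir / "records.fna").is_file():
        return "unit_test_data_build"  # raw inputs (at least records.fna) present
    return "unit_test_data_download"  # nothing yet: fetch test_data.json
```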

Collaborator

OK, I was able to run the tests using these instructions (and the last commit). Thanks!

$(PYTHON_INTERPRETER) make_test_data.py

## Run all unit tests
unit_test: tests/data/test_data.json
$(PYTHON_INTERPRETER) -m pytest --durations=0 --cov=autometa --emoji --cov-report html
Collaborator

I am getting an error when I run make unit_test:

make: *** No rule to make target 'tests/data/test_data.json', needed by 'unit_test'.  Stop.

Collaborator Author

Make is correctly pointing out that you do not have the test_data.json file needed to run the tests. You can download it from the autometa test data folder on Google Drive. I've also added a command to the Makefile that performs this download: make unit_test_environment

Collaborator

make unit_test_environment did not work for me as I didn't have gdown. Can you make it install that as well?

Collaborator Author

Is your conda environment named autometa, and is this environment activated? gdown is installed by the command just prior to the attempted download. We can just add gdown to requirements.txt, seeing as we plan on using it to write a 'download datasets' helper script.


## Run unit tests marked with WIP
unit_test_wip: tests/data/test_data.json
$(PYTHON_INTERPRETER) -m pytest -m "wip" --durations=0 --cov=autometa --emoji --cov-report html

## Run unit tests marked with entrypoint
unit_test_entrypoints: tests/data/test_data.json
$(PYTHON_INTERPRETER) -m pytest -m "entrypoint" --durations=0 --cov=autometa --emoji --cov-report html

test: tests/data/test_data.json
python -m pytest --durations=0 --cov=autometa --emoji --cov-report html

test-wip: tests/data/test_data.json
python -m pytest -m "wip" --durations=0 --cov=autometa --emoji --cov-report html
#################################################################################
# Self Documenting Commands #
#################################################################################

test-entrypoints: tests/data/test_data.json
python -m pytest -m "entrypoint" --durations=0 --cov=autometa --emoji --cov-report html
.DEFAULT_GOAL := help

.PHONY: hello docs clean test test-wip test-entrypoints
# Inspired by <http://marmelab.com/blog/2016/02/29/auto-documented-makefile.html>
# sed script explained:
# /^##/:
# * save line in hold space
# * purge line
# * Loop:
# * append newline + line to hold space
# * go to next line
# * if line starts with doc comment, strip comment character off and loop
# * remove target prerequisites
# * append hold space (+ newline) to line
# * replace newline plus comments by `---`
# * print line
# Separate expressions are necessary because labels cannot be delimited by
# semicolon; see <http://stackoverflow.com/a/11799865/1968>
.PHONY: help
help:
@echo "$$(tput bold)Available rules:$$(tput sgr0)"
@echo
@sed -n -e "/^## / { \
h; \
s/.*//; \
:doc" \
-e "H; \
n; \
s/^## //; \
t doc" \
-e "s/:.*//; \
G; \
s/\\n## /---/; \
s/\\n/ /g; \
p; \
}" ${MAKEFILE_LIST} \
| LC_ALL='C' sort --ignore-case \
| awk -F '---' \
-v ncol=$$(tput cols) \
-v indent=19 \
-v col_on="$$(tput setaf 6)" \
-v col_off="$$(tput sgr0)" \
'{ \
printf "%s%*s%s ", col_on, -indent, $$1, col_off; \
n = split($$2, words, " "); \
line_length = ncol - indent; \
for (i = 1; i <= n; i++) { \
line_length -= length(words[i]) + 1; \
if (line_length <= 0) { \
line_length = ncol - indent - length(words[i]) - 1; \
printf "\n%*s ", -indent, " "; \
} \
printf "%s ", words[i]; \
} \
printf "\n"; \
}' \
| more $(shell test $(shell uname) = Darwin && echo '--no-init --raw-control-chars')