* WIP: nltk trees
* readme and doc strings
* more addition to readme
* alleles_pattern_nltk.json
* tests updated
* fixed the failing tests
* updating pseudo grammar
* more to tests
* readme updated
* bug fix in nltk_trees
* Nltk trees manu (#39)
* fix fpbase things
* intermediate fix
* simplified version, new grammar, does not handle split
* half way
* added poetry dependencies
* simple version working
* fix tests
* update gitignore
* Ci workflow (#38)
* ci_yaml and docker
* updating ci.yaml
* updating ci.yaml
* dockerfile updated
* fixing ci
Co-authored-by: Anamika Yadav <anamika310.yadav@gmail.com>
* fix ci line
* remove docker action
* make action run at each push
* download tags in CI
* silly mistake CI fixed
* fixed error
Co-authored-by: Anamika Yadav <anamika310.yadav@gmail.com>
Co-authored-by: Manuel Lera Ramirez <manulera14@gmail.com>
1 parent f35eb72, commit 50df31f
Showing 24 changed files with 829 additions and 106 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
.vscode/ | ||
examples/ | ||
.github/ | ||
.venv/ | ||
.git/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
name: Python tests | ||
on: [push, pull_request] | ||
jobs: | ||
  test: | ||
    runs-on: ubuntu-20.04 | ||
    steps: | ||
      - name: checkout | ||
        uses: actions/checkout@v3 | ||
        with: | ||
          fetch-depth: 0 | ||
      - name: Install Python | ||
        uses: actions/setup-python@v1 | ||
        with: | ||
          python-version: 3.9 | ||
      - name: Install dependencies | ||
        run: | | ||
          python -m pip install --upgrade pip | ||
          pip install poetry | ||
          poetry config virtualenvs.create false | ||
          poetry install --no-dev | ||
      # Before running the test you have to download the tags! | ||
      - name: Run tests | ||
        run: | | ||
          python get_data/get_fpbase_data.py allele_components/tags_fpbase.toml | ||
          cd genestorian_module/test | ||
          python -m unittest | ||
  # Update docker image when committing to master branch if tests pass | ||
  # push_to_registry: | ||
  #   name: Push Docker image to Docker Hub | ||
  #   runs-on: ubuntu-latest | ||
  #   needs: test | ||
  #   if: github.ref == 'refs/heads/master' | ||
  #   steps: | ||
  #     - name: Check out the repo | ||
  #       uses: actions/checkout@v3 | ||
|
||
  #     - name: Log in to Docker Hub | ||
  #       uses: docker/login-action@v2 | ||
  #       with: | ||
  #         username: ${{ secrets.DOCKER_USERNAME }} | ||
  #         password: ${{ secrets.DOCKER_PASSWORD }} | ||
|
||
  #     - name: Extract metadata (tags, labels) for Docker | ||
  #       id: meta | ||
  #       uses: docker/metadata-action@v2 | ||
  #       with: | ||
  #         images: genestorian_refinement_pipeline | ||
|
||
  #     - name: Build and push Docker images | ||
  #       uses: docker/build-push-action@v3.1.1 | ||
|
||
  #       with: | ||
  #         context: . | ||
  #         push: true | ||
  #         tags: manulera/genestorian_refinement_pipeline:latest | ||
  #         labels: ${{ steps.meta.outputs.labels }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
FROM python:3.9 | ||
|
||
WORKDIR /pipeline | ||
|
||
RUN pip install poetry | ||
RUN pip install nltk | ||
RUN pip install toml | ||
|
||
COPY ./ /pipeline/ | ||
|
||
RUN poetry config virtualenvs.create false | ||
RUN poetry install --without dev | ||
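# `poetry shell` removed: virtualenvs.create is false, so there is no virtualenv to activate, and a RUN step cannot keep a shell open | ||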
|
||
COPY . /pipeline | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
for lab in *_lab | ||
do | ||
    cd "$lab" || continue  # skip entries that cannot be entered, so `cd ..` stays balanced | ||
    if test -f "format.py"; then | ||
        echo "running in $lab" | ||
        python format.py | ||
    fi | ||
    cd .. | ||
done |
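
For comparison, the same traversal can be sketched in Python. This is only an illustration and not part of the commit: the function names `find_format_scripts` and `run_format_scripts` are hypothetical, and the sketch assumes each `format.py` should run with its own lab directory as the working directory, as in the shell loop.

```python
import subprocess
import sys
from pathlib import Path


def find_format_scripts(root: Path) -> list[Path]:
    """Return the format.py of every *_lab directory under root that has one."""
    return sorted(
        lab / "format.py"
        for lab in root.glob("*_lab")
        if lab.is_dir() and (lab / "format.py").is_file()
    )


def run_format_scripts(root: Path) -> None:
    """Run each format.py from inside its own lab directory."""
    for script in find_format_scripts(root):
        print(f"running in {script.parent.name}")
        # check=True stops at the first failing script,
        # whereas the shell loop carries on to the next lab.
        subprocess.run([sys.executable, script.name], cwd=script.parent, check=True)
```

Calling `run_format_scripts(Path.cwd())` from the directory that holds the `*_lab` folders would mirror the shell loop, apart from the stricter failure handling noted in the comment.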
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# %% | ||
|
||
from nltk.chunk.regexp import RegexpChunkRule, ChunkString | ||
import re | ||
from nltk.tree import Tree | ||
from nltk.chunk import RegexpParser | ||
# %% | ||
grammar = """ | ||
GENE_DELETION|BLAH: {<GENE><SPACER>?<other>?<SPACER>?<MARKER>} | ||
""" | ||
|
||
custom_tag_parser = RegexpParser(grammar, root_label='ROOT') | ||
|
||
input = Tree('ROOT', [ | ||
Tree('GENE', ['mph1']), | ||
Tree('SPACER', ['::']), | ||
Tree('other', ['hello']), | ||
Tree('SPACER', ['::']), | ||
Tree('MARKER', ['kanr']) | ||
]) | ||
result: Tree = custom_tag_parser.parse(input)  # parse() returns a Tree; parse_all() would return a list | ||
# custom_tag_parser | ||
|
||
# %% | ||
|
||
# match = re.match('(aa)aa', 'aaaa') | ||
# match.group() | ||
# %% | ||
cs = ChunkString(input) | ||
|
||
rule = RegexpChunkRule.fromstring( | ||
'{<GENE><SPACER>?<other>?<SPACER>?<MARKER>}') | ||
|
||
print(rule._regexp) | ||
|
||
match = re.match(rule._regexp, cs._str) | ||
print(rule._regexp) | ||
print(match.groups()) | ||
# cs.xform(rule._regexp, '{\g<chunk>}') | ||
rule._regexp.flags | ||
# print(cs._str) |
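
The script above inspects NLTK internals (`rule._regexp`, `cs._str`) to see how a chunk rule is matched. As a rough mental model only, the idea can be sketched with the standard library: the tag sequence is written out as a string such as `<GENE><SPACER>...`, and the rule becomes an ordinary regular expression over that string. The helpers below (`encode_tags`, `tag_pattern_to_regex`, `find_chunks`) are hypothetical names for this illustration, they treat tag names literally, and they are not NLTK's actual implementation.

```python
import re

# Same tag pattern as in the experiment above, without the surrounding braces.
PATTERN = "<GENE><SPACER>?<other>?<SPACER>?<MARKER>"


def encode_tags(tags):
    """Write a tag sequence as one string, e.g. ['GENE', 'MARKER'] -> '<GENE><MARKER>'."""
    return "".join(f"<{tag}>" for tag in tags)


def tag_pattern_to_regex(pattern):
    """Wrap every <TAG> in a non-capturing group so ?, * and + apply to the whole tag."""
    return re.sub(
        r"<([^>]+)>",
        lambda m: "(?:<" + re.escape(m.group(1)) + ">)",
        pattern,
    )


def find_chunks(tags, pattern):
    """Return the sub-sequences of tags that the pattern matches, left to right."""
    encoded = encode_tags(tags)
    regex = re.compile(tag_pattern_to_regex(pattern))
    chunks = []
    for match in regex.finditer(encoded):
        if not match.group():
            continue  # ignore empty matches from fully optional patterns
        # Every tag contributes exactly one '<', so counting them maps
        # character offsets back to token indices.
        start = encoded.count("<", 0, match.start())
        end = encoded.count("<", 0, match.end())
        chunks.append(tags[start:end])
    return chunks
```

With the five tags from the example tree (`GENE`, `SPACER`, `other`, `SPACER`, `MARKER`), `find_chunks` returns a single chunk covering all of them, because both spacers and the `other` element are optional in the pattern.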
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,3 @@ | ||
Metadata-Version: 2.1 | ||
Name: genestorian-module | ||
Version: 0.0.0 | ||
Summary: UNKNOWN | ||
License: UNKNOWN | ||
Platform: UNKNOWN | ||
|
||
UNKNOWN | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,16 @@ | ||
setup.py | ||
genestorian_module/__init__.py | ||
genestorian_module/converge.py | ||
genestorian_module/fourth_version_pipeline.py | ||
genestorian_module/build_nltk_tags.py | ||
genestorian_module/build_nltk_trees.py | ||
genestorian_module/replace_feature.py | ||
genestorian_module/summary_nltk_tags.py | ||
genestorian_module/third_version_pipeline.py | ||
genestorian_module.egg-info/PKG-INFO | ||
genestorian_module.egg-info/SOURCES.txt | ||
genestorian_module.egg-info/dependency_links.txt | ||
genestorian_module.egg-info/top_level.txt | ||
genestorian_module.egg-info/top_level.txt | ||
test/test_build_nltk_tags.py | ||
test/test_build_grammar.py | ||
test/test_build_nltk_tags.py | ||
test/test_nltk_trees.py | ||
test/test_summary_nltk_tags.py |