Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error building db, fixed using older versions from NCBI #146

Closed
nickp60 opened this issue Oct 11, 2021 · 1 comment
Closed

Error building db, fixed using older versions from NCBI #146

nickp60 opened this issue Oct 11, 2021 · 1 comment

Comments

@nickp60
Copy link

nickp60 commented Oct 11, 2021

Hello, I am getting the following error after running this command to build a taxonomy database.

docker run --rm -it -v $(pwd):$(pwd) -w $(pwd) nghoffman/taxtastic:0.8.3 taxit -vvv new_database taxonomy.db
docker run --rm -it -v $(pwd):$(pwd) -w $(pwd) nghoffman/taxtastic:0.8.3 taxit -vvv new_database taxonomy.db
INFO ncbi 507 downloading ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdmp.zip to /Users/nwaters/GitHub/conditioning/mitre/taxdmp.zip
INFO ncbi 230 Clobbering database tables
2021-10-08 20:33:33,019 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
INFO log 109 SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2021-10-08 20:33:33,019 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,019 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
INFO log 109 SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2021-10-08 20:33:33,020 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,021 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("ranks")
INFO log 109 PRAGMA table_info("ranks")
2021-10-08 20:33:33,021 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,025 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("source")
INFO log 109 PRAGMA table_info("source")
2021-10-08 20:33:33,025 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,027 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("nodes")
INFO log 109 PRAGMA table_info("nodes")
2021-10-08 20:33:33,027 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("names")
INFO log 109 PRAGMA table_info("names")
2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("merged")
INFO log 109 PRAGMA table_info("merged")
2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,029 INFO sqlalchemy.engine.base.Engine 
DROP TABLE source
INFO log 109 
DROP TABLE source
2021-10-08 20:33:33,029 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,054 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,055 INFO sqlalchemy.engine.base.Engine 
DROP TABLE ranks
INFO log 109 
DROP TABLE ranks
2021-10-08 20:33:33,055 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,089 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
INFO ncbi 233 Creating database tables
2021-10-08 20:33:33,092 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("ranks")
INFO log 109 PRAGMA table_info("ranks")
2021-10-08 20:33:33,092 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,096 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("source")
INFO log 109 PRAGMA table_info("source")
2021-10-08 20:33:33,098 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("nodes")
INFO log 109 PRAGMA table_info("nodes")
2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("names")
INFO log 109 PRAGMA table_info("names")
2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,101 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("merged")
INFO log 109 PRAGMA table_info("merged")
2021-10-08 20:33:33,101 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,102 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE ranks (
        rank VARCHAR NOT NULL, 
        height INTEGER NOT NULL, 
        PRIMARY KEY (rank), 
        UNIQUE (height)
)


INFO log 109 
CREATE TABLE ranks (
        rank VARCHAR NOT NULL, 
        height INTEGER NOT NULL, 
        PRIMARY KEY (rank), 
        UNIQUE (height)
)


2021-10-08 20:33:33,102 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,124 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,127 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE source (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        description VARCHAR, 
        PRIMARY KEY (id), 
        UNIQUE (name)
)


INFO log 109 
CREATE TABLE source (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        description VARCHAR, 
        PRIMARY KEY (id), 
        UNIQUE (name)
)


2021-10-08 20:33:33,127 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,167 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,170 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE nodes (
        tax_id VARCHAR NOT NULL, 
        parent_id VARCHAR, 
        rank VARCHAR, 
        embl_code VARCHAR, 
        division_id VARCHAR, 
        source_id INTEGER, 
        is_valid BOOLEAN, 
        PRIMARY KEY (tax_id), 
        FOREIGN KEY(rank) REFERENCES ranks (rank), 
        FOREIGN KEY(source_id) REFERENCES source (id), 
        CHECK (is_valid IN (0, 1))
)


INFO log 109 
CREATE TABLE nodes (
        tax_id VARCHAR NOT NULL, 
        parent_id VARCHAR, 
        rank VARCHAR, 
        embl_code VARCHAR, 
        division_id VARCHAR, 
        source_id INTEGER, 
        is_valid BOOLEAN, 
        PRIMARY KEY (tax_id), 
        FOREIGN KEY(rank) REFERENCES ranks (rank), 
        FOREIGN KEY(source_id) REFERENCES source (id), 
        CHECK (is_valid IN (0, 1))
)


2021-10-08 20:33:33,170 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,193 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,193 INFO sqlalchemy.engine.base.Engine CREATE INDEX ix_nodes_parent_id ON nodes (parent_id)
INFO log 109 CREATE INDEX ix_nodes_parent_id ON nodes (parent_id)
2021-10-08 20:33:33,193 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,221 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,222 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE names (
        tax_id VARCHAR NOT NULL, 
        tax_name VARCHAR NOT NULL, 
        unique_name VARCHAR, 
        name_class VARCHAR NOT NULL, 
        source_id INTEGER, 
        is_primary BOOLEAN, 
        is_classified BOOLEAN, 
        PRIMARY KEY (tax_id, tax_name, name_class), 
        FOREIGN KEY(tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE, 
        FOREIGN KEY(source_id) REFERENCES source (id), 
        CHECK (is_primary IN (0, 1)), 
        CHECK (is_classified IN (0, 1))
)


INFO log 109 
CREATE TABLE names (
        tax_id VARCHAR NOT NULL, 
        tax_name VARCHAR NOT NULL, 
        unique_name VARCHAR, 
        name_class VARCHAR NOT NULL, 
        source_id INTEGER, 
        is_primary BOOLEAN, 
        is_classified BOOLEAN, 
        PRIMARY KEY (tax_id, tax_name, name_class), 
        FOREIGN KEY(tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE, 
        FOREIGN KEY(source_id) REFERENCES source (id), 
        CHECK (is_primary IN (0, 1)), 
        CHECK (is_classified IN (0, 1))
)


2021-10-08 20:33:33,222 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,248 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,249 INFO sqlalchemy.engine.base.Engine CREATE INDEX ix_names_tax_id_is_primary ON names (tax_id, is_primary)
INFO log 109 CREATE INDEX ix_names_tax_id_is_primary ON names (tax_id, is_primary)
2021-10-08 20:33:33,249 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,264 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,266 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE merged (
        old_tax_id VARCHAR NOT NULL, 
        new_tax_id VARCHAR, 
        PRIMARY KEY (old_tax_id), 
        FOREIGN KEY(new_tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE
)


INFO log 109 
CREATE TABLE merged (
        old_tax_id VARCHAR NOT NULL, 
        new_tax_id VARCHAR, 
        PRIMARY KEY (old_tax_id), 
        FOREIGN KEY(new_tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE
)


2021-10-08 20:33:33,266 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,306 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
2021-10-08 20:33:33,307 INFO sqlalchemy.engine.base.Engine CREATE INDEX ix_merged_old_tax_id ON merged (old_tax_id)
INFO log 109 CREATE INDEX ix_merged_old_tax_id ON merged (old_tax_id)
2021-10-08 20:33:33,307 INFO sqlalchemy.engine.base.Engine ()
INFO log 109 ()
2021-10-08 20:33:33,344 INFO sqlalchemy.engine.base.Engine COMMIT
INFO log 109 COMMIT
INFO ncbi 365 loading ranks
INFO ncbi 373 loading nodes
INFO ncbi 379 loading names
Traceback (most recent call last):
  File "/usr/local/bin/taxit", line 22, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/usr/local/lib/python2.7/site-packages/taxtastic/scripts/taxit.py", line 49, in main
    return action(arguments)
  File "/usr/local/lib/python2.7/site-packages/taxtastic/subcommands/new_database.py", line 86, in action
    ncbi_loader.load_archive(zfile)
  File "/usr/local/lib/python2.7/site-packages/taxtastic/ncbi.py", line 382, in load_archive
    self.load_table('names', rows=names_rows)
  File "/usr/local/lib/python2.7/site-packages/taxtastic/ncbi.py", line 343, in load_table
    cur.executemany(cmd, itertools.islice(rows, limit))
sqlite3.IntegrityError: UNIQUE constraint failed: names.tax_id, names.tax_name, names.name_class

It looks like older versions work: I ended up trying the 2018-12-01 version arbitrarily, and that was fine:

sudo docker run --rm -it -v $(pwd):$(pwd) -w $(pwd) nghoffman/taxtastic:0.8.3 taxit -vvv  new_database -u ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2018-12-01.zip
@nhoffman
Copy link
Member

Trying to clean up some issues - I suspect that you have either solved this or moved on, but hopefully this issue is fixed in v0.10.0 - if not feel free to reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants