Warnings while creating SQLite database #6

Open
breisfeld opened this issue Nov 27, 2017 · 14 comments

Comments

@breisfeld

Hi,

Your program medic looks extremely useful.

I am trying to create a medline sql database using files downloaded from ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline.

After the download, I issue the following shell command:

$ for file in baseline/medline17n*.xml.gz
do
  medic --url sqlite:///medline.db update $file
done

For each file, I get the following database table-related warnings:

K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Qualifier.citation' will copy column citations.pmid to column qualifiers.pmid, which conflicts with relationship(s): 'Qualifier.descriptor' (copies descriptors.pmid to qualifiers.pmid), 'Descriptor.qualifiers' (copies descriptors.pmid to qualifiers.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Citation.qualifiers' will copy column citations.pmid to column qualifiers.pmid, which conflicts with relationship(s): 'Qualifier.descriptor' (copies descriptors.pmid to qualifiers.pmid), 'Descriptor.qualifiers' (copies descriptors.pmid to qualifiers.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Section.citation' will copy column citations.pmid to column sections.pmid, which conflicts with relationship(s): 'Section.abstract' (copies abstracts.pmid to sections.pmid), 'Abstract.sections' (copies abstracts.pmid to sections.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Citation.sections' will copy column citations.pmid to column sections.pmid, which conflicts with relationship(s): 'Section.abstract' (copies abstracts.pmid to sections.pmid), 'Abstract.sections' (copies abstracts.pmid to sections.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)

Platform: Windows 7
Medic: 2.4.1
Python: Python 3.6.3 |Anaconda custom (64-bit)|

@fnl
Owner

fnl commented Nov 27, 2017

Did you do the setup step to "create" your database, as recommended in Setup (medic insert --url sqlite:///medline.db 123456)?

@fnl
Owner

fnl commented Nov 27, 2017

Forgot to add, as per the Setup instructions: did you create the tables with medic --url sqlite:///medline.db write 123?

@breisfeld
Author

$ medic insert --url sqlite:///medline.db 123456

K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Qualifier.citation' will copy column citations.pmid to column qualifiers.pmid, which conflicts with relationship(s): 'Qualifier.descriptor' (copies descriptors.pmid to qualifiers.pmid), 'Descriptor.qualifiers' (copies descriptors.pmid to qualifiers.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Citation.qualifiers' will copy column citations.pmid to column qualifiers.pmid, which conflicts with relationship(s): 'Qualifier.descriptor' (copies descriptors.pmid to qualifiers.pmid), 'Descriptor.qualifiers' (copies descriptors.pmid to qualifiers.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Section.citation' will copy column citations.pmid to column sections.pmid, which conflicts with relationship(s): 'Section.abstract' (copies abstracts.pmid to sections.pmid), 'Abstract.sections' (copies abstracts.pmid to sections.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Citation.sections' will copy column citations.pmid to column sections.pmid, which conflicts with relationship(s): 'Section.abstract' (copies abstracts.pmid to sections.pmid), 'Abstract.sections' (copies abstracts.pmid to sections.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
2017-11-27 14:37:24,007 medic.parser CRITICAL: error while parsing PMID 123456
Traceback (most recent call last):
  File "K:/Python/Anaconda/Scripts/medic", line 345, in <module>
    result = Main(args.command, args.files, Session(), not args.all)
  File "K:/Python/Anaconda/Scripts/medic", line 36, in Main
    return insert(session, files_or_pmids, unique)
  File "K:\Python\Anaconda\lib\site-packages\medic\crud.py", line 32, in insert
    _add(session, files_or_pmids, lambda i: session.add(i), uniq)
  File "K:\Python\Anaconda\lib\site-packages\medic\crud.py", line 178, in _add
    count += _downloadAll(session, dbHandle, pmids, unique)
  File "K:\Python\Anaconda\lib\site-packages\medic\crud.py", line 289, in _downloadAll
    return sum(map(streaming, chain(instances)))
  File "K:\Python\Anaconda\lib\site-packages\medic\crud.py", line 211, in _streamInstances
    for citation in _collectCitation(stream):
  File "K:\Python\Anaconda\lib\site-packages\medic\crud.py", line 230, in _collectCitation
    for instance in stream:
  File "K:\Python\Anaconda\lib\site-packages\medic\parser.py", line 81, in parse
    for instance in self.yieldInstances(element):
  File "K:\Python\Anaconda\lib\site-packages\medic\parser.py", line 108, in yieldInstances
    for i in self.yieldFromGenerator(element):
  File "K:\Python\Anaconda\lib\site-packages\medic\parser.py", line 116, in yieldFromGenerator
    instance = getattr(self, element.tag)(element)
  File "K:\Python\Anaconda\lib\site-packages\medic\parser.py", line 490, in MedlineCitation
    return Parser.MedlineCitation(self, element)
  File "K:\Python\Anaconda\lib\site-packages\medic\parser.py", line 146, in MedlineCitation
    created = options['created']
KeyError: 'created'
$ medic --url sqlite:///medline.db write 123

K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Qualifier.citation' will copy column citations.pmid to column qualifiers.pmid, which conflicts with relationship(s): 'Qualifier.descriptor' (copies descriptors.pmid to qualifiers.pmid), 'Descriptor.qualifiers' (copies descriptors.pmid to qualifiers.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Citation.qualifiers' will copy column citations.pmid to column qualifiers.pmid, which conflicts with relationship(s): 'Qualifier.descriptor' (copies descriptors.pmid to qualifiers.pmid), 'Descriptor.qualifiers' (copies descriptors.pmid to qualifiers.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Section.citation' will copy column citations.pmid to column sections.pmid, which conflicts with relationship(s): 'Section.abstract' (copies abstracts.pmid to sections.pmid), 'Abstract.sections' (copies abstracts.pmid to sections.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)
K:\Python\Anaconda\lib\site-packages\sqlalchemy\orm\relationships.py:2694: SAWarning: relationship 'Citation.sections' will copy column citations.pmid to column sections.pmid, which conflicts with relationship(s): 'Section.abstract' (copies abstracts.pmid to sections.pmid), 'Abstract.sections' (copies abstracts.pmid to sections.pmid). Consider applying viewonly=True to read-only relationships, or provide a primaryjoin condition marking writable columns with the foreign() annotation.
  for (pr, fr_) in other_props)

@fnl
Owner

fnl commented Nov 27, 2017

Seems very peculiar indeed, then. When I have some time to spare, I can try checking whether medic still works on OSX and/or Linux. Note, however, that I have no access to Windows machines.

@breisfeld
Author

I tested on OS X (High Sierra, 10.13.1 (17B48)) and get the same warnings.

I wonder if this has to do with differences in the versions of sqlalchemy we are using.

Here is info for my system.

Python 3.6.3 | packaged by conda-forge | (default, Nov  4 2017, 10:13:32)
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlalchemy
>>> sqlalchemy.__version__
'1.1.13'

@fnl
Owner

fnl commented Nov 28, 2017

Could totally be the case; when I developed and worked with this tool, SQLAlchemy was still at version 0.8 (see the setup.py).

@fnl
Owner

fnl commented Nov 28, 2017

Confirmed. Sadly, that means something in SQLAlchemy has changed sufficiently to break backwards compatibility. As I don't have enough time to spend on fixing that level of detail, I guess that has put medic out of service. PRs that fix this are welcome, but I am sorry to say that I have no ETA for when I can get this fixed.

@fnl fnl added the bug label Nov 28, 2017
@fnl
Owner

fnl commented Nov 28, 2017

If it helps, for anyone looking into this, I just tried the various SQLAlchemy versions, including 0.8 that was around when I developed this tool, and none of them fixes the problem. So I guess it's some change to SQLite or the Python API of the same.

@breisfeld
Author

I greatly appreciate all of the time you already put into this software and understand your time constraints.
I am not a database person, so I don't know if the warnings are pointing to anything significant or just what the current sqlalchemy considers poor practices. I may try to dig into this if I can find some time.
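For readers unsure what the warnings describe: they say that qualifiers.pmid is filled in from two parents at once, citations.pmid and descriptors.pmid. Below is a minimal sketch of that shape using only the standard-library sqlite3 module. The schema is a simplified, hypothetical stand-in (the `num` and `sub` columns and the key layout are assumptions, not medic's actual table definitions); it only shows that one column can sit in two foreign keys and that both parents must agree on its value.

```python
import sqlite3

# Hypothetical, simplified schema: qualifiers.pmid takes part in two foreign
# keys, one to citations and one (composite) to descriptors.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE citations (pmid INTEGER PRIMARY KEY);
CREATE TABLE descriptors (
    pmid INTEGER NOT NULL REFERENCES citations (pmid),
    num  INTEGER NOT NULL,
    PRIMARY KEY (pmid, num)
);
CREATE TABLE qualifiers (
    pmid INTEGER NOT NULL REFERENCES citations (pmid),
    num  INTEGER NOT NULL,
    sub  INTEGER NOT NULL,
    PRIMARY KEY (pmid, num, sub),
    FOREIGN KEY (pmid, num) REFERENCES descriptors (pmid, num)
);
""")

conn.execute("INSERT INTO citations VALUES (1)")
conn.execute("INSERT INTO citations VALUES (2)")
conn.execute("INSERT INTO descriptors VALUES (1, 1)")

# Consistent row: pmid 1 exists in citations and (1, 1) exists in descriptors.
conn.execute("INSERT INTO qualifiers VALUES (1, 1, 1)")

# Inconsistent row: pmid 2 is a valid citation, but descriptor (2, 1) does not exist.
try:
    conn.execute("INSERT INTO qualifiers VALUES (2, 1, 1)")
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

qualifier_rows = conn.execute("SELECT COUNT(*) FROM qualifiers").fetchone()[0]
```

On this reading, the warnings flag an ambiguity in the ORM mapping (two relationships both want to write the same column) rather than a failed database operation. Whether they matter for medic in practice is not settled in this thread.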

@fnl
Owner

fnl commented Nov 29, 2017

Well, since I tried building medic in an environment with every SQLAlchemy version back to 0.8 and none of them works, I doubt that this is a SQLAlchemy problem at all. It seems odd that this issue would pop up now but never did before with the same earlier versions (I think I was using, e.g., 0.9 without trouble). But yes, I could be wrong on that count, e.g., if medic was only working due to some bug in SQLAlchemy that got patched in later versions of the 0.8 release.

@fnl
Owner

fnl commented Nov 29, 2017

BTW, if I'm correct, medic should still work with PostgreSQL.

@fnl fnl changed the title Warnings while creating database Warnings while creating SQLite database Nov 29, 2017
@breisfeld
Author

That is odd. It seems like the messages are emitted by SQLAlchemy, but perhaps those are a side effect of changes to sqlite/pysqlite.
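To compare environments, it may help to report the SQLite library version next to the Python and SQLAlchemy versions. The following is a small sketch; the helper name `describe_environment` is made up for illustration, and SQLAlchemy is treated as optional so the snippet also runs where it is not installed.

```python
import sqlite3
import sys


def describe_environment():
    """Collect the version strings relevant to this issue."""
    info = {
        "python": sys.version.split()[0],
        # Version of the SQLite C library that the sqlite3 module is linked against.
        "sqlite_library": sqlite3.sqlite_version,
    }
    try:
        import sqlalchemy
        info["sqlalchemy"] = sqlalchemy.__version__
    except ImportError:
        info["sqlalchemy"] = None  # not installed in this environment
    return info


report = describe_environment()
for name, version in report.items():
    print(f"{name}: {version}")
```

Posting this output from a working and a failing setup would show whether the SQLite library differs between them.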

@byeongchul

byeongchul commented Apr 6, 2020

I faced a similar problem with the 2020 baseline data, and found that there is no DateCreated element in the current PubMed DTD.

$ medic --url sqlite:///tmp.db insert 123456 

...

 in MedlineCitation
    return Parser.MedlineCitation(self, element)
  File "/home/bckang/.conda/envs/nlp/lib/python3.7/site-packages/medic/parser.py", line 146, in MedlineCitation
    created = options['created']
KeyError: 'created'

@byeongchul

byeongchul commented Apr 7, 2020

As I understand it, the <DateCreated> element was removed in 2018 (https://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html)
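A possible direction for a fix, sketched with the standard library only: treat the creation date as optional instead of indexing `options['created']` directly. This is not medic's parser code. The helper names, the `completed` and `revised` keys, and the sample record (with placeholder dates) are assumptions for illustration; only the `created` key and the KeyError come from the traceback above.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Made-up sample record in the post-2018 shape: no <DateCreated> element.
SAMPLE = """
<MedlineCitation Status="MEDLINE" Owner="NLM">
  <PMID Version="1">123456</PMID>
  <DateCompleted><Year>1975</Year><Month>09</Month><Day>01</Day></DateCompleted>
  <DateRevised><Year>2019</Year><Month>07</Month><Day>12</Day></DateRevised>
</MedlineCitation>
""".strip()


def parse_date(element):
    """Build a date from <Year>, <Month>, and <Day> child elements."""
    return date(
        int(element.findtext("Year")),
        int(element.findtext("Month")),
        int(element.findtext("Day")),
    )


def collect_dates(citation):
    """Return only the date fields that are actually present in the record."""
    options = {}
    for tag, key in (
        ("DateCreated", "created"),
        ("DateCompleted", "completed"),
        ("DateRevised", "revised"),
    ):
        child = citation.find(tag)
        if child is not None:
            options[key] = parse_date(child)
    return options


citation = ET.fromstring(SAMPLE)
options = collect_dates(citation)

# dict.get returns None for a missing key, where options['created'] raises KeyError.
created = options.get("created")
```

A real fix would also need the database column for the creation date to accept a missing value, which depends on medic's schema and is not covered here.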

@fnl fnl added the helpwanted label Apr 7, 2020