Internet tab added to Media category. #1200
A unit test is failing. |
I have spent some days trying to understand why the
test_tcg_and_check_and_repair test fails, but I am sorry to say that I
still don't.
There are variations of this failure for my PRs adding an internet tab to
citations, events and families, but the one for sources passes that test.
I think it is awkward that the test_tcg_and_check_and_repair test is
sensitive to the addition of the url in the database tables, and I have
not been able to find out why.
From what I understand, the test_tcg_and_check_and_repair test:
1. imports a static gramps file into a test database,
2. runs the test case generation tool which adds a lot of entries in
the test database, and
3. runs the check and repair tool and checks if the outcome is as
expected (a static value - see lines 139-159 in tools_test.py), and
if not, the test fails.
Now I wonder if the expected outcome is by design or if it rather is a
capture of a particular test case run (which should suffice as a
regression test). My feeling is that it is the latter. One thing that
gives me that feeling is that I have found a bug in check.py, which is
invoked as a result of the test_tcg_and_check_and_repair test which in
turn is implemented in tools_test.py. In check.py, on lines 850-858, the
column indices of the change time stamp for the database objects are
wrong for both places (11 - should be 15) and media (8 - should be 9
before this PR and 10 after). When correcting this in gramps 5.1
(without any PR), the test outcome changes and the test fails.
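To illustrate why hard-coded indices like these go stale, here is a minimal sketch. The field lists, the records and the constant `CHANGE_INDEX` are made up for illustration; they are not the real Gramps serialization layouts or the real values in check.py.

```python
# Hypothetical layouts of a serialized object; NOT the real Gramps tuples.
# "change" is the last-modified time stamp.
OLD_FIELDS = ("handle", "gramps_id", "path", "change", "private")
NEW_FIELDS = ("handle", "gramps_id", "path", "urls", "change", "private")

CHANGE_INDEX = 3  # hard-coded constant, correct only for OLD_FIELDS

old_record = ("h1", "O0001", "a.jpg", 1621600000, False)
new_record = ("h1", "O0001", "a.jpg", [], 1621600000, False)


def change_time_hardcoded(record):
    """Read the time stamp using the fixed position."""
    return record[CHANGE_INDEX]


def change_time_by_name(record, fields):
    """Look the position up from the field list instead of hard-coding it."""
    return record[fields.index("change")]


print(change_time_hardcoded(old_record))            # 1621600000
print(change_time_hardcoded(new_record))            # [] (the url list, not the time stamp)
print(change_time_by_name(new_record, NEW_FIELDS))  # 1621600000
```

With the stale constant, the code silently reads a different field once a column is inserted before it, which is the kind of error described above for places and media.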
My final attempt to get an understanding was to capture the database
contents after step 2 above in test runs without and with the PR and
compare them, but there were too many differences (disregarding the
differing change time stamps) between the two to give me any clue. I
believe that the test case generation tool is sensitive to the PR
changes, but I cannot understand why.
I could of course just change the expected output in tools_test.py to
match the output of a test run with the PR, but the test would likely
break again when adding any of the other PRs.
Please advise me on what I should do regarding this problem.
Jan
…On 2021-05-21 17:36, Nick Hall wrote:
|
As you have surmised, this tool is primarily testing to see if there are changes by comparing a test db to a previous run by the last developer. The db generation uses random numbers to create the objects for test and is VERY sensitive to changes. As to why, I'm not sure; perhaps some part of the random number generation is in some way content sensitive. You would have to look at how each of the object generation code items was done to see if you could find this.
Note, I have also found that the check is very sensitive to settings for Gramps. When it runs in Travis, it is running as a clean install. When run on your own system (or mine) it may be using different settings, resulting in the random object generation producing different results. So trying to get a run result on your own system that matches the template values in tools_test.py is difficult at best. For this reason, when I update template values in tools_test.py for the result, I take them from the Travis run from the PR, after very carefully making sure that the resulting changes due to the other PR elements are really doing what we want.
I think it would be a good idea to get the check.py cleanup_empty_objects CHANGE_ table fixed up first (this is a poor design, in that it doesn't get automatically fixed when objects are redesigned). Then capture the output of the test, and try your PR again, with the CHANGE_ values updated for the PR as well as the wrong media and place values. That MAY be enough to allow a test match. I'm guessing this because, if the CHANGE_* values are wrong, the check might be incorrectly detecting empty objects, and this may affect some of the other test results. |
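One possible mechanism for this kind of sensitivity, shown only as a sketch: if the generator draws from a single seeded random stream, any code change that consumes one extra random number shifts every value drawn afterwards. The function `generate` and its `extra_draw` flag are hypothetical and do not come from the test case generation tool; whether this is what actually happens there is not confirmed.

```python
import random


def generate(seed, extra_draw):
    """Create five fake object numbers from one seeded random stream.

    extra_draw simulates new code (for example, filling a newly added
    field) that consumes one additional random number per object.
    """
    rng = random.Random(seed)
    numbers = []
    for _ in range(5):
        if extra_draw:
            rng.random()  # draw for the hypothetical new field
        numbers.append(rng.randint(0, 10**9))
    return numbers


before = generate(seed=42, extra_draw=False)
after = generate(seed=42, extra_draw=True)

# Same seed and same code path reproduce the same data.
print(before == generate(seed=42, extra_draw=False))
# Same seed but one extra draw per object: compare the two runs.
print(before == after)
```

If something like this applies, the generated database would differ in many unrelated places after a small schema change, even though nothing is actually broken.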
I have not (yet) made any further effort to find out why the db
test case generation is affected.
At least, I am able to run setup.py in my gramps eclipse project and get
exactly the same outcome as the Travis run, with one exception.
ExportControl in exports_test.py fails. TIME values are 1 hour off, VERS
(gramps version) and DATE (of gramps) are wrong, and some FILE paths
differ. I have not (yet) bothered to investigate why. (The latest gramps
commit I have in the master branch in eclipse is e3b6076... "Merge
branch 'gramps51'". The "Internet tab added to XX category" branches
all branch off from that commit.)
Regarding the column indices in the database blobs, you may have noticed
that I did a half-hearted job in my "Internet tab added to XX category"
PRs. I created a "column" tuple that actually replicates the
information already there in the result returned by the get_schema()
function. One could maybe go one step further and create the "column"
tuple from the schema, e.g. by calling get_schema() and "filter out" the
column names in the __init__() function. Such a "filter-out-columns or
get-columns or ..." function could maybe be hosted by the PrimaryObject
class. Anyway, I used that column tuple in upgrade.py. It works, but it
may not be the most elegant way to do it. Python is still a rather new
experience to me.
To make check.py resilient to schema changes, one has to decide if one
should go along the lines I took, or if one should do something better,
and then apply the change to all primary objects.
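As a rough sketch of the "get columns from the schema" idea: the class `FakeMedia`, its field list, and the helpers `get_columns` and `column_index` below are all hypothetical. The sketch also assumes that the order of the schema properties matches the order of the serialized tuple, which would have to be verified for the real classes.

```python
class FakeMedia:
    """Stand-in for a primary object; not the real Gramps Media class."""

    @classmethod
    def get_schema(cls):
        # Assumption: property order equals the order of the serialized tuple.
        return {
            "type": "object",
            "title": "Media",
            "properties": {
                "handle": {"type": "string"},
                "gramps_id": {"type": "string"},
                "path": {"type": "string"},
                "urls": {"type": "array"},
                "change": {"type": "integer"},
                "private": {"type": "boolean"},
            },
        }


def get_columns(obj_class):
    """Return the column names in schema order (hypothetical helper)."""
    # Python dicts keep insertion order, so iterating gives the declared order.
    return tuple(obj_class.get_schema()["properties"])


def column_index(obj_class, name):
    """Return the position of a named column (hypothetical helper)."""
    return get_columns(obj_class).index(name)


CHANGE_MEDIA = column_index(FakeMedia, "change")
print(CHANGE_MEDIA)  # 4
```

With something along these lines, a constant such as the change time stamp index would follow the schema automatically when a field is added.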
So I wonder what the way forward is. Is it:
1. Create a PR that fixes the column index issue. Would maybe affect
PrimaryObject and all its subclasses, check.py and upgrade.py (and
maybe something else not yet identified)
2. Update the "Internet tab added to XX category" PRs to match the
changes made by the PR in step 1.
3. Turn a blind eye to Travis failing in the
test_tcg_and_check_and_repair test.
4. When the PRs chosen by you that are related to this are merged into
the master branch, adjust tools_test.py to make Travis happy.
Or something else?
Jan
…On 2021-05-26 16:35, Paul Culley wrote:
|
In my opinion, I would not at this point try to fix the column index issue for all classes. Instead I would just fix up the check.py table. The reason is that we may, relatively soon, be making a large change to the db underpinnings: converting stored data from pickled serialized objects to json objects. If we do this, we would lose the serialized data calls to the db and have to make a lot of changes to code like check.py to deal with this. And a more general method of column indexing would become unnecessary.
I would make the check.py fix for media and place a separate commit as part of this PR; if you made it a PR on its own, then this would become a dependent PR with potential to get held up. I would not try to satisfy Travis in that commit, but rather deal with it as part of the overall PR. That way you only have to do it once. Additional changes to that table for your new URL would be part of the overall PR commits.
In general, we need to make Travis happy before a PR gets accepted, so you may end up dealing with this more than once, depending on when your PR is accepted.
Regarding your other URL PRs, we are back to the dependency issues, depending on what gets accepted first. I might be inclined to just let them stay 'in progress' until this one is accepted and then fix them up to work, after rebasing them on the master commits that result from this PR.
There are a couple of other significant PRs (place updates, and UIDs) that will also have dependency issues like this; whichever gets accepted first will make a bunch of work for the others to make sure that it all comes out correctly. |
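To show the difference in access style between the two storage forms, here is a small sketch using only the standard library. The record contents and field names are invented for illustration and do not reflect the actual Gramps data or any planned json layout.

```python
import json
import pickle

# Pickled tuple: the reader must know the position of each field.
record_tuple = ("h1", "O0001", "a.jpg", 1621600000)
blob = pickle.dumps(record_tuple)
change_from_tuple = pickle.loads(blob)[3]  # breaks if a field is inserted earlier

# json object: the reader asks for the field by name.
record_dict = {
    "handle": "h1",
    "gramps_id": "O0001",
    "path": "a.jpg",
    "change": 1621600000,
}
text = json.dumps(record_dict)
change_from_json = json.loads(text)["change"]

# Adding a new field does not disturb the lookup by name.
record_dict_with_urls = dict(record_dict, urls=[])
change_after_addition = json.loads(json.dumps(record_dict_with_urls))["change"]

print(change_from_tuple, change_from_json, change_after_addition)
```

With name-based access there is no index table to keep in step with the object layout.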
Are you saying that column indexing will be replaced by dictionaries,
i.e. key:value pairs, in the future?
Are you also saying that I, for each of the five "Internet tab added to
XX category" PRs, should:
1. On top of what I have in those branches, make a commit (A) that:
* changes the value of CHANGE_PLACE from 11 to 15 in check.py
* changes the value of CHANGE_XX to its correct value in check.py
2. On top of that commit, make a commit (B) that:
* changes tools_test.py lines 139-159 to match the Travis output.
Then, whenever Travis signals that the test_tcg_and_check_and_repair
test fails for any of the "Internet tab added to XX category" PRs:
1. Revert the (B) commit (I have never done so; maybe it would be a hard
reset?)
2. Change:
* tools_test.py: change lines 139-159 to match the Travis output
* check.py: change CHANGE_YY for any YY that has moved into the
master branch
3. Make a new commit (B')
That would add up to 15 (B) commits (5 + 4 + 3 + 2 + 1). Phew ...
Jan
…On 2021-05-27 16:39, Paul Culley wrote:
|
I believe that the PR now passes the tests. Should I wait for this one to be merged, then update one of the others? Conflicts will occur on upgrade.py, check.py, and possibly tools_test.py whenever any of them are merged, so I need to rebase the PR branches. Jan |
If it was me, I would wait. Less work, avoiding the merge conflicts multiple times etc. And if there are any other concerns with the way this is done when Nick H reviews, you could make appropriate changes to the others to match. |
Travis reports this for you:
Summary: gramps/gen/db/upgrade.py: gramps/plugins/importer/importxml.py: After every change to a file, Travis runs again and checks your code. |
I agree with adding urls to all primary objects except for sources and citations. I think that they should be handled slightly differently because a url actually forms part of a citation. Having said that, I'm about to review all pull requests in light of the release of GEDCOM 7.0 today. |
I don't agree regarding citations. We can have a url associated: |
@SNoiraud Sorry, I didn't make myself very clear. I just think that they need to be handled slightly differently for citations.
Typically, a citation will only have one url and it will form part of the formatted citation, together with the date that it was accessed. Both these fields should probably be available in the main tab of citation editor. If we used attributes to record such information we could define a special attribute to provide a formatted citation. For example, an attribute called "format" could contain the string "{page} ({url}, accessed: {date})".
In GEDCOM 7.0, the recommendation seems to be to store such information in the PAGE tag. |
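A "format" attribute of that kind could be expanded with ordinary Python string formatting. This is only a sketch: the function `format_citation` and the sample values are hypothetical, and nothing here is existing Gramps code.

```python
def format_citation(template, fields):
    """Fill a citation template from a mapping of attribute names to values."""
    return template.format(**fields)


template = "{page} ({url}, accessed: {date})"
fields = {
    "page": "p. 12",
    "url": "https://example.org/record/1",
    "date": "2021-06-08",
}

print(format_citation(template, fields))
# p. 12 (https://example.org/record/1, accessed: 2021-06-08)
```

A missing attribute would raise a KeyError with this approach, so a real implementation would need to decide how to handle citations that lack a url or an access date.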
I am on thin ice now, but I wonder if it could be the case that GEDCOM 7.0
allows media (MULTIMEDIA_RECORD) to be a web-accessible file, e.g. in the
form of text/html.
If Gramps allowed that, then one could put web links in the media
galleries rather than under a separate Internet tab.
And, in GEDCOM 7.0, FAMILY_RECORD, INDIVIDUAL_RECORD, SOURCE_RECORD,
SOURCE_CITATION and EVENT_DETAIL may contain MULTIMEDIA_LINKs.
PLACE_STRUCTURE, however, lacks MULTIMEDIA_LINKs.
Jan
…On 2021-06-08 19:53, Nick Hall wrote:
|
The GEDCOM 7.0 spec does allow URLs as the source of the file for multimedia. So does Gramps, theoretically, although it only works in pretty special circumstances. |
Closed in favour of implementing web-accessible file references in media objects. See the Gedcom 7.0 FILE tag specification. |
Supports database upgrade, export and import of XML as well as web report.