Internet tab added to Media category. #1200

janskarvall · 2021-04-22T12:09:19Z

Supports database upgrade, export and import of XML as well as web report.

Nick-Hall · 2021-05-21T15:36:07Z

A unit test is failing.

janskarvall · 2021-05-25T20:58:41Z

I have spent some days trying to understand why the test_tcg_and_check_and_repair test fails, but I am sorry to say that I still don't. There are variations of this failure for my PRs adding an Internet tab to citations, events and families, but the one for sources passes that test. I think it is awkward that the test_tcg_and_check_and_repair test is sensitive to the addition of the url in the database tables, and I have not been able to find out why.

From what I understand, the test_tcg_and_check_and_repair test:

1. imports a static gramps file into a test database,
2. runs the test case generation tool, which adds a lot of entries to the test database, and
3. runs the check and repair tool and checks whether the outcome is as expected (a static value, see lines 139-159 in tools_test.py); if not, the test fails.

Now I wonder if the expected outcome is by design or if it is rather a capture of a particular test case run (which should suffice as a regression test). My feeling is that it is the latter. One thing that gives me that feeling is that I have found a bug in check.py, which is invoked by the test_tcg_and_check_and_repair test, which in turn is implemented in tools_test.py. In check.py, on lines 850-858, the column indices of the change time stamp for the database objects are wrong for both places (11, should be 15) and media (8, should be 9 before this PR and 10 after). When correcting this in gramps 5.1 (without any PR), the test outcome changes and the test fails.

My final attempt to get an understanding was to capture the database contents after step 2 above in test runs without and with the PR and compare them, but there were too many differences between the two (even disregarding the differing change time stamps) to give me any clue. I believe that the test case generation tool is sensitive to the PR changes, but I cannot understand why.

I could of course just change the expected output in tools_test.py to match the output of a test run with the PR, but the test would likely break again when adding any of the other PRs. Please advise me what I should do regarding this problem.

Jan
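
To illustrate the index problem described above, here is a minimal sketch. The field names and tuple layouts are invented for the example and are not the real Gramps serialization; only the idea is taken from the comment, namely that a hand-maintained position for the change time stamp goes stale when a field is inserted before it.

```python
# Hypothetical, simplified field layouts; NOT the real Gramps media tuple.
FIELDS_BEFORE = ("handle", "gramps_id", "path", "mime", "desc", "change", "private")
FIELDS_AFTER = ("handle", "gramps_id", "path", "mime", "desc", "urls", "change", "private")

# A hard-coded index, as in a lookup table maintained by hand.
CHANGE_MEDIA = 5  # correct for FIELDS_BEFORE only


def read_change(record, index):
    """Return whatever sits at the assumed change-timestamp position."""
    return record[index]


old_record = ("h1", "O0001", "a.jpg", "image/jpeg", "photo", 1619000000, False)
new_record = ("h1", "O0001", "a.jpg", "image/jpeg", "photo", [], 1619000000, False)

# With the stale index the url list is read instead of the time stamp.
stale = read_change(new_record, CHANGE_MEDIA)
# Looking the position up by name gives the time stamp again.
fresh = read_change(new_record, FIELDS_AFTER.index("change"))
```

In this toy layout the stale index silently returns the wrong field rather than raising an error, which is why such a bug can go unnoticed until a test outcome shifts.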

prculley · 2021-05-26T14:34:41Z

As you have surmised, this tool is primarily testing to see if there are changes by comparing a test db to a previous run by the last developer. The db generation uses random numbers to create the objects for test and is VERY sensitive to changes. As to why, I'm not sure, perhaps some part of the random number generation is in some way content sensitive, you would have to look at how each of the object generation code items was done to see if you could find this.
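
One way such sensitivity can arise, offered here only as an assumption and not as the confirmed cause in this PR, is that a code change consumes one extra random number per generated object, which shifts every later value drawn from the same seeded stream. The function and parameter names below are hypothetical.

```python
import random


def generate(n_objects, extra_draw_per_object=False, seed=42):
    """Generate pseudo-random 'object ids' from a fixed seed.

    extra_draw_per_object mimics a code change that consumes one more
    random number per object (for example, to fill in a new field).
    """
    rng = random.Random(seed)
    ids = []
    for _ in range(n_objects):
        ids.append(rng.randint(0, 10**9))
        if extra_draw_per_object:
            rng.random()  # the new field consumes one extra draw
    return ids


baseline = generate(5)
changed = generate(5, extra_draw_per_object=True)
```

The first id is the same in both runs, since it is drawn before any extra draw happens, but the sequences diverge afterwards, so any expected result captured from the baseline run no longer matches.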

Note, I have also found that the check is very sensitive to settings for Gramps. When it runs on Travis, it runs as a clean install. When run on your own systems (or mine) it may be using different settings, resulting in the random object generation producing different results. So trying to get a run result on your own system that matches the template values in tools_test.py is difficult at best. For this reason, when I update template values in tools_test.py for the result, I take them from the Travis run from the PR, after very carefully making sure that the resulting changes due to the other PR elements are really doing what we want.

I think it would be a good idea to get the check.py cleanup_empty_objects CHANGE_ table fixed up first (this is a poor design, in that it doesn't get automatically fixed when objects are redesigned). Then capture the output of the test, and try your PR again, with the CHANGE_ values updated for the PR as well as the wrong media and place values. That MAY be enough to allow a test match. I'm guessing this because, if the CHANGE_* values are wrong, the check might be incorrectly detecting empty objects, and this may affect some of the other test results.

janskarvall · 2021-05-27T09:39:03Z

I have not (yet) made any further effort to find out why the db test case generation is affected. At least, I am able to run setup.py in my gramps eclipse project and get exactly the same outcome as the Travis run, with one exception: ExportControl in exports_test.py fails. TIME values are 1 hour off, VERS (gramps version) and DATE (of gramps) are wrong, and some FILE paths differ. I have not (yet) bothered to investigate why. (The latest gramps commit I have in the master branch in eclipse is e3b6076... "Merge branch 'gramps51'". The "Internet tab added to XX category" branches branch off from that commit.)

Regarding the column indices in the database blobs, you may have noted that I did a half-hearted job in my "Internet tab added to XX category" PRs. I created a "column" tuple that actually replicates the information already there in the result returned by the get_schema() function. One could maybe go one step further and create the "column" tuple from the schema, e.g. by calling get_schema() and "filtering out" the column names in the __init__() function. Such a "filter-out-columns or get-columns or ..." function could maybe be hosted by the PrimaryObject class. Anyway, I used that column tuple in upgrade.py. It works, but it may not be the most elegant way to do it. Python is still a rather new experience to me.

To make check.py resilient to schema changes, one has to decide whether to go along the lines I took or to do something better, and then apply the change to all primary objects. So I wonder what the way forward is. Is it:

1. Create a PR that fixes the column index issue. This would maybe affect PrimaryObject and all its subclasses, check.py and upgrade.py (and maybe something else not yet identified).
2. Update the "Internet tab added to XX category" PRs to match the changes made by the PR in step 1.
3. Turn a blind eye to Travis failing in the test_tcg_and_check_and_repair test.
4. When the PRs chosen by you that are related to this are merged into the master branch, adjust tools_test.py to make Travis happy.

Or something else?

Jan
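
The idea of building the column tuple from the schema could be sketched roughly as follows. The schema dict is a made-up, much reduced example, the helper name columns_from_schema is hypothetical, and the sketch assumes that the order of "properties" matches the order of the serialized fields and that keys starting with an underscore are metadata rather than stored columns.

```python
def columns_from_schema(schema):
    """Return the property names of a JSON-Schema-like dict, in order.

    Assumption: the order of "properties" matches the order of the fields
    in the serialized tuple, and names starting with "_" are not columns.
    """
    return tuple(
        name for name in schema["properties"] if not name.startswith("_")
    )


# Hypothetical schema, far smaller than a real one.
MEDIA_SCHEMA = {
    "type": "object",
    "properties": {
        "_class": {"enum": ["Media"]},
        "handle": {"type": "string"},
        "gramps_id": {"type": "string"},
        "path": {"type": "string"},
        "urls": {"type": "array"},
        "change": {"type": "integer"},
        "private": {"type": "boolean"},
    },
}

COLUMNS = columns_from_schema(MEDIA_SCHEMA)
CHANGE_INDEX = COLUMNS.index("change")  # derived, not hard-coded
```

With a derived index, adding a property to the schema moves CHANGE_INDEX automatically, so a table of positions would not need to be edited by hand.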

prculley · 2021-05-27T14:39:09Z

In my opinion, I would not at this point try to fix the column index issue for all classes. Instead I would just fix up the check.py table. The reason is that we may, sometime relatively soon, be making a large change to the db underpinnings: converting stored data from pickled serialized objects to JSON objects. If we do this, we would lose the serialized data calls to the db and have to make a lot of changes to code like check.py to deal with this. And a more general method of column indexing would become unnecessary.
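
As a rough illustration of why key-based records would make an index table unnecessary, compare positional and keyed access when a field is added. The record layouts and field names are hypothetical and do not reflect the actual Gramps storage format.

```python
import json

# Hypothetical record layouts; field names are illustrative only.
CHANGE_INDEX = 3  # position of "change" in the tuple form, maintained by hand


def change_from_tuple(record):
    """Positional access: only correct while CHANGE_INDEX matches the layout."""
    return record[CHANGE_INDEX]


def change_from_json(text):
    """Key-based access: unaffected by fields being added or reordered."""
    return json.loads(text)["change"]


tuple_v1 = ("h1", "O0001", "a.jpg", 1619000000)
tuple_v2 = ("h1", "O0001", "a.jpg", [], 1619000000)  # "urls" inserted

json_v1 = json.dumps({"handle": "h1", "gramps_id": "O0001", "path": "a.jpg",
                      "change": 1619000000})
json_v2 = json.dumps({"handle": "h1", "gramps_id": "O0001", "path": "a.jpg",
                      "urls": [], "change": 1619000000})
```

Here change_from_tuple returns the url list for tuple_v2, while change_from_json returns the time stamp for both versions.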

I would make the check.py fix for the media and place a separate commit as part of this PR; if you made it a PR on its own, then this would become a dependent PR with potential to get held up. I would not try to satisfy Travis in that commit, but rather deal with it as part of the overall PR. That way you only have to do it once. Additional changes to that table for your new URL would be part of the overall PR commits.

In general, we need to make Travis happy before a PR gets accepted, so you may end up dealing with this more than once, depending on when your PR is accepted.

Regarding your other URL PRs, we are back to the dependency issues, depending on what gets accepted first. I might be inclined to just let them stay 'in progress' until this one is accepted and then fix them up to work, after rebasing them on the master commits that result from this PR.

There are a couple of other significant PRs (place updates, and UIDs) that will also have dependency issues like this; whichever gets accepted first will make a bunch of work for the others to make sure that it all comes out correctly.

janskarvall · 2021-05-27T19:31:42Z

Are you saying that column indexing will be replaced by dictionaries, i.e. key:value pairs, in the future?

Are you also saying that I, for each of the five "Internet tab added to XX category" PRs, should:

1. On top of what I have in those branches, make a commit (A) that:
   * changes the value of CHANGE_PLACE from 11 to 15 in check.py
   * changes the value of CHANGE_XX to its correct value in check.py
2. On top of that commit, make a commit (B) that:
   * changes tools_test.py lines 139-159 to match the Travis output.

Then, whenever Travis signals that test_tcg_and_check_and_repair fails for any of the "Internet tab added to XX category" PRs:

1. Revert the (B) commit (I have never done so; maybe it would be a hard reset?)
2. Change:
   * tools_test.py: change lines 139-159 to match the Travis output
   * check.py: change CHANGE_YY for any YY that has moved into the master branch
3. Make a new commit (B')

That would sum up to making 15 (B) commits (5 + 4 + 3 + 2 + 1). Puh ...

Jan

janskarvall · 2021-06-05T07:25:49Z

I believe that the PR now passes the tests.
If so, what would be my next step with the other "Internet tab added to XX category" PR's?

Should I wait for this one to be merged, then update one of the others? Conflicts will occur on upgrade.py, check.py, and possibly tools_test.py whenever any of them are merged, so I need to rebase the PR branches.

Jan

prculley · 2021-06-05T14:13:26Z

If it was me, I would wait. Less work, avoiding the merge conflicts multiple times etc. And if there are any other concerns with the way this is done when Nick H reviews, you could make appropriate changes to the others to match.

PushKK · 2021-06-07T18:04:31Z

Travis reports this for your PR:

gramps/gen/db/upgrade.py:53:    
gramps/gen/db/upgrade.py:61:    
gramps/gen/db/upgrade.py:66:    
gramps/gen/db/upgrade.py:68:    
gramps/gen/db/upgrade.py:82:                 
gramps/gui/editors/editmedia.py:219:        
gramps/plugins/importer/importxml.py:1587:             self.object.add_url(url)           
gramps/plugins/webreport/media.py:615:                
ERROR - Trailing whitespace found in source file(s)

Summary:

gramps/gen/db/upgrade.py:
You must delete the whitespace on row 53 (4 spaces = 1 tab).
Rows 60, 65, 67 and 81 may have whitespace before '#' or 'from'.

gramps/plugins/importer/importxml.py:
Row 1587 has whitespace at the end:
" self.object.add_url(url) "

After you change each file, Travis will check your code again.
You may be able to change your code in this PR on the 'Files Changed' tab.
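
For checking locally before pushing, a small helper along these lines could be used. It is only a sketch and not the script that Travis runs (that script is not shown in this thread); the function names are hypothetical.

```python
def find_trailing_whitespace(text):
    """Return 1-based line numbers of lines that end in spaces or tabs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip(" \t"):
            hits.append(number)
    return hits


def strip_trailing_whitespace(text):
    """Return text with trailing spaces and tabs removed from every line."""
    stripped = "\n".join(line.rstrip(" \t") for line in text.splitlines())
    # Preserve a final newline if the input had one.
    return stripped + "\n" if text.endswith("\n") else stripped


# Line 2 is whitespace only; line 3 has spaces after the statement.
sample = "def f():\n    \n    return 1   \n"
offending = find_trailing_whitespace(sample)
cleaned = strip_trailing_whitespace(sample)
```

Note that a line consisting only of indentation counts as trailing whitespace too, which matches the whitespace-only rows listed in the Travis output above.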

Nick-Hall · 2021-06-07T21:15:07Z

I agree with adding urls to all primary objects except for sources and citations. I think that they should be handled slightly differently because a URL actually forms part of a citation.

Having said that, I'm about to review all pull requests in light of the release of GEDCOM 7.0 today.

SNoiraud · 2021-06-08T12:02:30Z

> I agree with adding urls to all primary objects except for sources and citations.

I don't agree about citations. We can have a URL associated with each:
For a source: https://recherche.archives.morbihan.fr/ark:/15049/vta5448859ac193c/daogrp/0/1#id:1177128651?gallery=true&brightness=100.00&contrast=100.00&center=2089.485,-1351.836&zoom=4&rotation=0.000
For a citation: https://recherche.archives.morbihan.fr/ark:/15049/vta544882252c093/daogrp/0/1#id:600358297?gallery=true&center=2077.599,-1343.327&zoom=12&rotation=0.000&brightness=100.00&contrast=100.00

Nick-Hall · 2021-06-08T17:53:24Z

@SNoiraud Sorry, I didn't make myself very clear. I just think that they need to be handled slightly differently for citations.

Typically, a citation will only have one url and it will form part of the formatted citation, together with the date that it was accessed. Both these fields should probably be available in the main tab of citation editor. If we used attributes to record such information we could define a special attribute to provide a formatted citation. For example, an attribute called "format" could contain the string "{page} ({url}, accessed: {date})".
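
The template string above happens to be valid Python format syntax, so the idea can be sketched as follows. The helper name and the field values are hypothetical, and nothing here is an existing Gramps API.

```python
def format_citation(template, page, url, date):
    """Fill a citation template that uses {page}, {url} and {date} fields."""
    return template.format(page=page, url=url, date=date)


# Template string taken from the comment above; the values are made up.
template = "{page} ({url}, accessed: {date})"
formatted = format_citation(
    template,
    page="p. 12",
    url="https://example.org/register",
    date="2021-06-08",
)
```

With these values the result is "p. 12 (https://example.org/register, accessed: 2021-06-08)".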

In GEDCOM 7.0, the recommendation seems to be to store such information in the PAGE tag.

janskarvall · 2021-06-09T09:26:22Z

I am on thin ice now, but I wonder if it could be the case that GEDCOM 7.0 allows media (MULTIMEDIA_RECORD) to be a web-accessible file, e.g. in the form of text/html. If gramps would allow that, then one could put web links in the media galleries rather than under a separate Internet tab. And, in GEDCOM 7.0, FAMILY_RECORD, INDIVIDUAL_RECORD, SOURCE_RECORD, SOURCE_CITATION and EVENT_DETAIL may contain MULTIMEDIA_LINKs. PLACE_STRUCTURE, however, lacks MULTIMEDIA_LINKs.

Jan

prculley · 2021-06-09T16:05:14Z

The GEDCOM 7.0 spec does allow URLs as the source of the file for multimedia. So does Gramps, theoretically, although it only works in pretty special circumstances.

Nick-Hall · 2023-06-28T22:29:10Z

Closed in favour of implementing web-accessible file references in media objects. See the Gedcom 7.0 FILE tag specification.

Nick-Hall added the enhancement label May 21, 2021

Internet tab added to Media category.

7e6ad89

janskarvall force-pushed the inttab_media branch from e4dbaf6 to 20dbcda Compare May 31, 2021 19:55

Fixed faulty change column values and expected test outcome.

bccb6ed

janskarvall force-pushed the inttab_media branch from 20dbcda to bccb6ed Compare June 8, 2021 19:31

Nick-Hall closed this Jun 28, 2023
