
Add handling of usage to DAO generator #25874

Merged · merged 1 commit into civicrm:master on Mar 22, 2023

Conversation

@eileenmcnaughton (Contributor) commented on Mar 20, 2023

Overview

Add handling of usage to DAO generator

Before

The use of the DAO field import is overloaded to imply import, export, usable for dedupe matching & a whole bunch of other things I haven't yet figured out...

After

A new usage key on fields.

During generation it will ALWAYS be added to the DAOs for each field, like:

  'usage' => [
    'import' => FALSE,
    'export' => TRUE,
    'duplicate_matching' => FALSE,
    'token' => FALSE,
  ],

If it is not in the XML, it will make explicit what was implied anyway (sketched below) - ie

  • export is TRUE if export or import is true
  • import is TRUE if import is true
  • duplicate_matching is TRUE if import is true
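
A minimal sketch of that implied fallback (hypothetical variable names, not the generator's actual code):

  // Hypothetical sketch: derive the explicit 'usage' flags from the legacy
  // overloaded import/export metadata when <usage> is absent from the XML.
  $import = !empty($legacyField['import']);
  $export = !empty($legacyField['export']);
  $usage = [
    'import' => $import,
    'export' => $import || $export,
    'duplicate_matching' => $import,
    'token' => FALSE,
  ];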

Technical Details

  • I have only included one regenerated DAO with 1 explicit field & the rest 'determined'. I would propose to do the rest of the DAOs as a follow-up PR once merged (XMLs only need to be updated as needed)

  • I explicitly declared usage for each field. I prefer this explicitness but the counter-argument is more file size for tar-balling or more use in memory / serializing

  • This doesn't add any usage - any code that consumes it should do so in a way that falls back to the key not being there (not necessarily in a notice-free way); see the sketch after this list

  • It would be nice if sites could override these the way they do settings - ie easy for a site admin without an extension

  • I kinda wonder if we want to allow for more nuance later, ie:

  'usage' => [
    'import' => ['is_available' => FALSE],
    'export' => ['is_available' => FALSE],
    'duplicate_matching' => ['is_available' => FALSE],
  ],
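
Following up on the fallback point above, a hypothetical consumer-side sketch (the DAO class and flag access are illustrative only, not code from this PR):

  // Read a usage flag while tolerating DAOs that have not been regenerated
  // and therefore lack the 'usage' key entirely.
  foreach (CRM_Core_DAO_Address::fields() as $name => $field) {
    $isExportable = $field['usage']['export'] ?? !empty($field['export']);
    // ... act on $isExportable ...
  }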

Comments

@colemanw as discussed on chat


@eileenmcnaughton (Contributor, Author):

test this please

@eileenmcnaughton (Contributor, Author):

@colemanw I just hit another argument for YAPOM - the import code uses DAO_Import::fields() & then parses to re-format into apiv4 field names (currently unique names but the goal was custom fields too) - because I can't filter apiv4 getFields for import
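
For illustration, a hypothetical sketch of what userland filtering could look like once getFields exposes the new metadata (a 'usage' key in the getFields output is an assumption here, not current behaviour):

  // Hypothetical: filter APIv4 getFields output down to import-capable
  // fields, falling back to the legacy 'import' flag where 'usage' is absent.
  $fields = \Civi\Api4\Contact::getFields(FALSE)->execute();
  $importable = array_filter($fields->getArrayCopy(), function (array $field) {
    return $field['usage']['import'] ?? !empty($field['import']);
  });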

@colemanw (Member):

I like the direction of this. Wish I'd thought of it years ago. As it stands, there's going to be a lot of fudging to accommodate old daos without these flags, but eventually we can add noisy deprecations.

<import> is one of the most scary/mysterious pieces of metadata in the schema so I've never touched it. It's mighty brave of you to do so @eileenmcnaughton

@eileenmcnaughton (Contributor, Author):

@colemanw do you have any thoughts on the second & last points in the technical details?

@colemanw (Member):

@eileenmcnaughton I agree with the 2nd point, because not including it for every field leads to ambiguity (does missing data == false or does it mean the dao needs to be regenerated?)

To the last point, I prefer the flatter structure you've got now. We can always add more keys to the array, but I think nesting in more arrays is overkill.

@eileenmcnaughton (Contributor, Author):

@colemanw OK - I can't think of a use-case for more nuance at the moment - it's more that transforming from a bool to an array later is tricky - but as you say there could be new array entries

Review comment (Contributor) on lines 391 to 395:

  if ($usage['import'] === 'TRUE') {
    $field['import'] = $usage['import'];
  }
  if ($usage['export'] === 'TRUE') {
    $field['export'] = $usage['export'];

Suggested change:

  $field['import'] = $usage['import'] === 'TRUE' ? $usage['import'] : NULL;
  $field['export'] = $usage['export'] === 'TRUE' ? $usage['export'] : NULL;

@eileenmcnaughton (Contributor, Author):

@seamuslee001 @colemanw it has passed!

The follow-on steps from here go in a few different directions - fixing the API to return & preferably filter on usage (I probably need some help from @colemanw there) and fixing some of the DAO functions like importableFields to use usage, along with metadata cleanup & DAO regeneration - but I think we should merge this one as-is first since those spider-web off in different directions
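
As one illustration of that second direction, a hypothetical sketch (not code from this PR) of how a helper along the lines of importableFields could consult the new key while older DAOs still lack it:

  // Hypothetical: prefer the explicit usage flag, fall back to the legacy
  // overloaded 'import' flag for DAOs that have not been regenerated yet.
  function getImportableFields(string $daoClass): array {
    return array_filter($daoClass::fields(), function (array $field) {
      return $field['usage']['import'] ?? !empty($field['import']);
    });
  }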

Review comment by @eileenmcnaughton (Contributor, Author) on:

  <usage>
    <import>false</import>
    <export>true</export>
    <duplicate_matching>false</duplicate_matching>

I don't want to have to set tests going again now - but will add token to this when I do a metadata pass.

@totten (Member) commented on Mar 21, 2023

Overall, I also like the concept here. Definitely think it's better to split into a separate subkey.

(@eileenmcnaughton) I kinda wonder if we want to allow for more nuance later ie

(@colemanw) I prefer the flatter structure you've got now. We can always add more keys to the array,

+1 for the flatter structure.

I explicitly declared usage for each field. I prefer this explicitness but the counter-argument is more file size for tar-balling or more use in memory / serializing

Yeah, the tarball... I wouldn't worry about that. For the memory, let's ballpark it and see how we feel.

  • Model size

    • There are ~1,800 fields in the database (per git grep '<field>' xml/schema/ | wc) spread across ~180 entities. That means an average of ~10 fields per entity.
    • I hacked my CRM_Core_ClassLoader to log every time it loads a _DAO_ class, and then I clicked around a few screens (login, view dashboard, search participants, edit a contact, etc). Across 48 HTTP requests, the average (mean) number of DAOs loaded per request was 55.
  • Size of usage metadata

    • Serialized Measure: I copied the usage from one of the Address.php fields, and it weighs in at 80-100 bytes (in json or serialize format). I'm not certain how it looks with PHP opcodes, but it's probably a similar ballpark. Call it 100 bytes.
    • Theoretical Measure: Suppose you encoded with a more compact format. There are 4 flags. Each flag could be mapped to a 1-byte value (number/letter), and you test flags with strpos() !== FALSE (see the sketch after this list). That's up to 4 bytes. But add some room for growth. Call it 10 bytes.
    • Empirical Measure: But then I tried a quick php-cli call to load CRM/Core/DAO/Address.php. Per memory_get_usage(), it now requires 19kb more. This was consistent across multiple calls. It has 29 fields. That means actual memory usage grows by 650b per field.
      • It does make sense for this to be larger -- e.g. you read the PHP file; generate opcodes; then run the opcodes. The extra usage gets reflected in multiple ways.
      • An even better measurement would be to update 10 DAOs and then measure memory_get_usage() in php-fpm SAPI. But I'm getting lazy...
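
For illustration, a minimal sketch of the compact encoding described under the theoretical measure (the flag letters are invented for the example; nothing like this is proposed by the PR):

  // Hypothetical: pack the four flags into a short string and test them with
  // strpos() !== FALSE. 'i' = import, 'e' = export, 'd' = duplicate_matching,
  // 't' = token.
  $usage = 'ed'; // exportable and duplicate-matchable, not importable
  $isImportable = strpos($usage, 'i') !== FALSE; // FALSE
  $isExportable = strpos($usage, 'e') !== FALSE; // TRUE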

Putting those numbers together, the ballpark looks like this:

  Description              Full Model                          Average Model
                           (e.g. build full metadata cache)    (e.g. average HTTP request)
  Entities                 180                                 55
  Fields                   1800                                550
  Serialized Size (100b)   180kb                               55kb
  Theoretical Size (10b)   18kb                                6kb
  Empirical Size (650b)    1.1mb                               350kb

I think you're right to ask the question, but I'm not really sure of the answer. At 18kb -- no problem. At 350kb or 1mb, hard to say. I'm not sure how to formulate a cut-off.

@eileenmcnaughton (Contributor, Author):

@totten is there anything to be read into this test run seeming to take a bit longer than some other test runs? (I looked at 2 others & they took less time)

@totten (Member) commented on Mar 21, 2023

(@eileenmcnaughton) is there anything to be read into this test run seeming to take a bit longer than some other test runs (I looked at 2 others & they took less time)

Not that I can see. There's some natural variation due to hardware differences, overall load, etc -- and (eg) the recent runs of phpunit-crm and phpunit-api3 landed on mid-tier (bknix-XXXXX) and low-tier (rabul-t450s) hardware, and the timings were not unusual for them. If there is a performance impact in the patch, then it's not dramatic. (It's small enough that one would need multiple/controlled runs to detect it.)

@colemanw (Member):

Ok, I like the proposed structure, so I will merge this so that @eileenmcnaughton can continue progressing on it.

@colemanw merged commit ab906d7 into civicrm:master on Mar 22, 2023
@colemanw deleted the usage branch on March 22, 2023 at 12:13
colemanw added a commit to colemanw/civicrm-core that referenced this pull request on Mar 27, 2023
eileenmcnaughton added a commit that referenced this pull request on Mar 27, 2023: "Regenerate DAOs with usage from #25874"