Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Req. / Enhancement] Restructure contact fields #4139

Closed
MaxWebmecanik opened this issue May 24, 2017 · 11 comments
Closed

[Feature Req. / Enhancement] Restructure contact fields #4139

MaxWebmecanik opened this issue May 24, 2017 · 11 comments
Labels
feature

Comments

@MaxWebmecanik
Copy link
Contributor

MaxWebmecanik commented May 24, 2017

Q A
Bug report?
Feature request?
Enhancement?

Description:

Today, when a contact field is created, Mautic does two things:

  1. It creates a entry in the lead_fields table containing the metadata of that field
  2. It adds a columns to the leads table which will then contain the actual contents of that field

The problem today is that MySQL tables are limited in width by the very nature of how RDBMSs work. This means that adding too custom fields, especially text fields which are stored as VARCHAR(255) results in problems between what Doctrine thinks should be in the database and the actual schema (see #3202 (comment) and #3202).

Some possible quick fixes:

  • Limiting the number of custom fields the user can create
  • Store custom fields in narrower data types such as TEXT instead of VARCHAR(255)

Neither of these would adresse the real issue with the current structure, custom fields are limited.
This is especially problematic for CRM users who want to find all of their data in both their marketing automation solution and CRM solution.

Therefore, I think the only way forward is to drastically restructure contact fields. There is already some structure in place with the lead_fields table. I think what we would need to do would be to create a lead_fields_leads_xref table which would have the following structure

Lead ID 🔑 Lead Field ID 🔑 Content (maybe some metadata columns...)
42 (My contact ID) 3 (First name field ID) Maximilien
... ... ...
42 (Still my contact ID 1337 (Salesforce opportunity closing date field ID) 24/05/2017
... ... ...

This would allow (practically) infinite custom fields.

@alanhartless, @matishaladiwala I would love to have your feedback on this issue!

@escopecz escopecz added the feature label May 24, 2017
@MaxWebmecanik
Copy link
Contributor Author

MaxWebmecanik commented May 30, 2017

Hi everyone, this morning, we gathered with our devs to talk about how we could make this happen.

Our conclusion was the following:

Challenges

A few challenges arise from that kind of structure.

Datatypes

The biggest of which being how do we handle different datatypes in the same kind of structure efficiently rather than storing everything in a LONGTEXT column and casting types.

Our answer to that would be to fragment the lead_fields_leads_xref table above by datatypes into mutiple tables of the same structure as the one above but with the Content column of the type designated by the table.

For example, the bool_lead_fields_leads_xref table would have its Content column of type TINYINT where as the text_lead_fields_leads_xref table would have its Content column of type LONGTEXT

This would have the following advantages:

  • Native MySQL types are more efficient and will allow better indexing
  • A few smaller tables will be much quicker to search through in the long run that one general all-purpose table

Inserts

An issue that would now stem from fracturing the database is the number of inserts we need to do for every new lead.
We realized that it would be extremely costly to insert into 3 ~ 5 different tables for each new anonymous lead visiting a page, therefore, we cannot store NULL values for every field for every contact. Rather, we should rely on the EXISTance of an entry in these link tables to tell whether or not that field is null for any given lead.

Native vs. Custom fields

For fields such as email, we cannot afford to move that data away from the central lead table as it is a crucial piece of information used throughout the application. Here are the other fields that I worry about but am not sure whether or not we should give them special (native) treatment:

  • Points
  • Company
  • Attribution
  • Attribution Date
  • Preferred Locale

If we do make such a schism (which I think we should), we would then need to indicate this difference in one of two different ways:

  • Either remove these fields from the lead_fields table which would then only contain the custom fields (but I think this would introduce backwards incompatible changes)
  • Or add a is_native column to the lead_fields table that would indicate whether their value is held by the <TYPE>_ lead_fields_leads_xref table or the leads table itself

Migrating

After this evolution, the migration process will obviously be difficult.
This is something we need to discuss and I raise the following questions:

  • How should we handle a migration that fails during this schema overhaul?
  • Should we keep the original structure and shift leads to the new schema as they are added / modified?
  • Do you have any suggestions for testing this as thoroughly as possible?

@alanhartless
Copy link
Contributor

alanhartless commented May 30, 2017

Back when we first started coding Mautic, we took a similar structure. The issue came with segments where the queries became so complex, it was ridiculous and so we switched to what is now. But we've ran into the same issues with our bigger customers and have candidly discussed similar strategy but because it would require a huge change and not only schema structure, but pretty much everywhere Mautic deals with custom fields, it would have to come with a major Mautic release. This will be a huge BC break that will affect a lot of code.

@heathdutton
Copy link
Member

heathdutton commented Jan 25, 2018

Instead of a complete break, why not take a smoother transition by allowing fields to be "extended" upon creation as an option? Most installations will be fine with all the fields in one table. Other installations need to add hundreds/thousands of custom fields. We could allow both, as the situation demands, as a bridge to whatever pattern is used in future version of Mautic.

Something like:
Standard, Extended, Secure

  • Standard fields: Shows fields remaining that can be added. Option is removed if the number exceeds the limit, or if the count of contacts exceeds 1M (to reduce likelihood of outages at load due to table alteration).
  • Extended fields: New fields are created as xref tables, based on type, etc.
  • Secure fields: Same as extended fields, but the tables include the word "secure" in the name. This would make it easy for me to keep the DB HIPAA compliant by excluding access to such tables by certain parties, while giving access to others to reporting. (I'm obviously trying to kill a few birds with one stone here, and this is outside the scope of the request. But it'd be nice).

@hhhc
Copy link

hhhc commented Jan 30, 2018

Hi, I have also come across this issue when trying to add custom fields. Using Mautic as a customer data platform, we planned to have 400-600 attributes. When playing around with it, another short-term solution could be to optimize the varchar lengths of the fields.

The standard fields like firstname, lastname are all varchar(255) but most likely would not need to be that big.
On custom text fields, a length indicator could permit that the length of the varchar could be set during creation.

I have changed core for testing reasons and these are my numbers:
./mautic/2.12.1/vendor/doctrine/dbal/lib/Doctrine/DBAL/Platforms/MySqlPlatform.php
In function getVarcharTypeDeclarationSQLSnippet hardcode the length to 100/40 to avoid an error "Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535."

Setting length to 255 (standard) allowed me to have 106 text custom fields.
Setting length to 100 allowed me to have 145 text custom fields.
Setting length to 40 allowed me to have 164 text custom fields.

Although I know that this does not solve all issues for everybody it still could improve the situation for some.

@heathdutton
Copy link
Member

heathdutton commented Feb 5, 2018

If anyone is interested, we're started working on a plugin solution that works similar to the "Company" model, but divides out the fields by type into xref tables: https://github.com/TheDMSGroup/mautic-extended-fields
Still a WIP but should be ready for production soon.

@dbhurley
Copy link
Member

dbhurley commented Feb 25, 2018

In order to improve engagement and increase visibility to this feature request we are closing this issue on GitHub and recommending all Feature Requests (without code attachment) be placed in the Ideas forum on our community site.

We recommend you post your ideas to Mautic's forum so others can comment on them and help turn your ideas into a proof of concept and eventually an exciting new feature in Mautic.

More information about this move can be found in our blog announcement.

@heathdutton
Copy link
Member

heathdutton commented Mar 22, 2018

We're using the plugin above in production, even though it's not to 1.0 yet. But we're wondering if it would be a good idea to create a PR to add this functionality to core. It would end up being a much cleaner implementation (no compiler pass, overrides), etc. Just not sure how the community would feel about it?

@hhhc
Copy link

hhhc commented Mar 27, 2018

Having the flexibility to choose whether to store values as today or adding hundreds of fields in the extended format would be of great benefit. Also, the API would be nicely integrated (which is a huge point from my perspective).

+1

@albanleandri
Copy link

albanleandri commented Jun 2, 2018

Is it planned to do anything about that in M3 to overcome this limitation? (Above is said it requires a major release, because of BC troubles).

Something I would like to say is that the motivation why people tend to add so many custom fields should be understood first so that the solution is adapted. Obviously, I can't talk for others Mauticians needs regarding custom fields, but I feel that I am creating 80% of my custom fields just for the following reason:
Linking form submission field results and campaign email seems only possible with customs fields in Mautic. Being able to personalize emails w/ leads data is a table-stake feature in Marketing automation platforms such as Hubspot, or IBM's Watson Campaign Automation.

For example, for sending "delayed" abandonment email to leads, and incorporate within it some shopping cart data (that would be captured by submitting forms by Mautic API since Mautic doesn't support custom triggers/events, as I know), I guess it would help if we could use in email templates token similar as {formfield=....} WHENEVER the campaign email ("Send email" action) is placed right after a "form submitted" decision in the campaign (Imagine a new action called "Send email w/ form fields" )
It would somehow alleviate the need to create a new custom field (a column for ALL leads).

@heathdutton
Copy link
Member

heathdutton commented Jun 2, 2018

@AlbanL74fr Our team wants to take the Extended Field plugin and rewrite it as a PR to core 2.x so that new fields can optionally be "extended" so that there'd be no limit on the number of fields. In a way it could ensure that 3.x will also support an unlimited field model because the bridge will have already been crossed by then (effectively proving it can be done).

A quick test with ~3 million leads and 355 extended fields (using the plugin w/ 2.12.x):

  • The typical query to pull all field values for a lead is about ~300ms for the first fetch (using a big union query by indexed primary foreign keys).
  • Subsequent queries of the same lead are ~3ms (thanks MySQL).
  • Segment updates are about the same as they'd be for core custom fields (~20% slower, but hard to determine since you can't really have this many core custom fields). This doesn't really affect us since it's all done by cron.

It's a pretty good outcome. Ultimately this means we can delete a custom field in production, without an outage while a deletion of a column occurs, and we can add custom fields to our heart's content for special campaign/client needs.

The plugin code obviously needs refinement and rewriting for core (it was our first plugin, and we've learned a lot since then). But I think we could make a decent PR now. The question is if the core team would accept it?

What do you think @alanhartless? Yes those queries get complex, but they work ¯\(ツ)

@mautibot
Copy link

mautibot commented Feb 29, 2020

This issue has been mentioned on Mautic Community Forums. There might be relevant details there:

https://forum.mautic.org/t/you-have-reached-the-limit-of-custom-field-allowed-in-your-database/13147/2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature
Projects
None yet
Development

No branches or pull requests

8 participants