Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No random characters at the end of database name #34

Open
jaapjansma opened this issue Dec 16, 2015 · 9 comments
Open

No random characters at the end of database name #34

jaapjansma opened this issue Dec 16, 2015 · 9 comments

Comments

@jaapjansma
Copy link

Is it possible to omit the random characters of the database name created by amp? I have a few database installed with amp and it is hard to remember each random suffix to use in mysql workbench during development and in my case I have made sure that the prefix is unique, e.g. contains a project name

@totten
Copy link
Collaborator

totten commented Dec 16, 2015

It would require patching. The random characters are appended at:

https://github.com/totten/amp/blob/master/src/Amp/Database/MySQL.php#L55

FWIW, in early revisions, it didn't append the random chars, and I ran into a few situations where:

  • The project directory names were unique but too long. (MySQL imposes a 16 character limit on usernames, and amp matches the DB+username.) If the names follow a formula (wordpress_demo_4_5_civi and wordpress_demo_4_5_cms) and if you truncate the names, then they're no longer unique.
  • The web-root was placed in a subdirectory. (Ex: In Symfony apps, the web-root is conventionally my-project/web. In projects like https://github.com/civicrm/civicrm-website-org , the web-root can be my-project/drupal or my-project/public.) In this case, I'd get multiple projects named web or public.

Agree that it's possible to arrange for unique project names, and I can imagine some tooling which makes the random names inconvenient. (Extra comments below.) But also think it should work 'out-of-the-box' with those two situations.

Possible compromises:

  • Make the random characters optional. Maybe declare a parameter in app/default/services.yml; by default, the option enables random suffixes, but you can disable it by editing ~/.amp/services.yml or calling amp config:set. (For a similar example, note how mysql_dsn works in services.yml and Amp/Database/MySQL.php)
  • In Amp/Database/MySQL, you could cut down the amount of randomness. (Just append one random letter or two random digits.) This makes it easier to remember but increases the risk of collisions, so then createDatasource() would need to test for collisions (and retry until it finds a free value).

It's curious that I never notice issues with randomish names - may be a difference in tools or workflows? In case it helps, here are some things in my workflow which may be mitigating factors:

  • Project names tend to be pretty short (d46, dmaster, wpmaster), so the DB names are still distinctive in reading DB list.
  • In MySQL CLI, one can tab-complete DB names.
  • I generally open MySQL CLI by calling something like amp sql -Ncms, amp sql -Ncivi, or drush sqlc. As long as the command runs in the right directory, you don't need to know the DB credentials.

@jaapjansma
Copy link
Author

I was thinking of making the randomness optional and if not set enabled by default. And looking from the integration part making it a variable in civibuild which one could set in the configuration. I have directory http whith sub directories such as civirules, civirules46, pum, maf, sp, dgw, banking etc.. all of them are unique.

It is not a major issue. But the solution sounds a bit too complicated to develop in a few minutes

@xurizaemon
Copy link

Current output means that a site named "wordpress" ends up with two unclear DB names (wordpressc_7t4my, wordpressc_lph8v) which removes clarity. Shifting cms/crm to the right side of the underscore would solve this trivially (wordpress_cms7t4my, wordpress_crmlph8v), but still leave potential for confusion when there are multiple sites of similar / same names.

MySQL DB names can be up to 64 char long, the current cut off could be longer - there's a subjective balance between "how long it can possibly be" and "how long are DB names I like to read" :)

Using truncation + a sequence might be clearer - in practice (Aegir) this still feels ugly when cutting off at 16 char, but at least I can tell which DB is the most likely candidate. Aegir's Provision_Service_db::suggest_db_name(), for reference:

/**
 * Find a viable database name, based on the site's uri.
 */ 
function suggest_db_name() {
  $uri = $this->context->uri;

  $suggest_base = substr(str_replace(array('.', '-'), '' , preg_replace('/^www\./', '', $uri)), 0, 16);

  if (!$this->database_exists($suggest_base)) {
    return $suggest_base;
  }

  for ($i = 0; $i < 100; $i++) {
    $option = sprintf("%s_%d", substr($suggest_base, 0, 15 - strlen( (string) $i) ), $i);
    if (!$this->database_exists($option)) {
      return $option;
    }
  }

  drush_set_error('PROVISION_CREATE_DB_FAILED', dt("Could not find a free database names after 100 attempts"));
  return false;
}

@jaapjansma
Copy link
Author

@totten what do you think? Should we follow @xurizaemon and implement a sequence number rather than a randomstring at the end?

@totten
Copy link
Collaborator

totten commented Jan 7, 2016

I just don't feel as confident in strictly-sequential naming. That's not to say the naming itself won't work or must cause problems -- but that it puts more pressure on other parts of the system and creates "if's" and "but's".

  • When launching nightly tests on several branches, it needs to spin up several DBs at once. Similarly, when launching a multithreaded test, it needs to spin up several DBs at once. With concurrent operation, you really need locking for the ID generator to work. Early versions of amp didn't haven't locking, but the current one does have a lock (based on files+PIDs). But that only works on single-host deployments.
  • That suggest_db_name() algorithm makes it more likely that a DB name will be reused over time. (If you drop DB foo_2, and then foo_2 will be used for the next site.) That reuse can be dangerous if there are stale references. (In my head, there's a big data-loss scenario, but it takes a specific sequence of actions.)

Granted, for most local devs, these situations are unlikely/rare. But when these edge-cases come up, we have to deal with the fallout/debugging. Random names just feel safer as a default.

If you want to add an option where one can opt-in to a different naming pattern, that sounds fine.

@xurizaemon
Copy link

xurizaemon commented Apr 22, 2017

@ejegg @awight It looks like your branch ejegg/amp@precreated is working on a similar approach here ... If so, hopefully this is an OK place to ask: what's an example PRECREATED_DSN_PATTERN, and where are you configuring it?

@ejegg
Copy link
Contributor

ejegg commented Apr 22, 2017

@xurizaemon the motivation for the precreated type was so we could run a minimal script as db admin to create the databases, then have amp run as a normal db user. Here are the scripts that run in our CI environment:
https://github.com/wikimedia/wikimedia-fundraising-crm/blob/master/bin/ci-settings.sh
https://github.com/wikimedia/wikimedia-fundraising-crm/blob/master/bin/ci-create-dbs.sh
https://github.com/wikimedia/wikimedia-fundraising-crm/blob/master/bin/ci-populate-dbs.sh

So the prefix is unique to the Jenkins job ID, and we get to test with drupal tables and civi tables in separate databases, to catch any tricky cross-db gotchas.

PRECREATED_DSN_PATTERN="mysql://${CIVICRM_MYSQL_USERNAME}:${CIVICRM_MYSQL_PASSWORD}@${CIVICRM_MYSQL_CLIENT}/${CIVICRM_SCHEMA_PREFIX}{{db_seq}}"

The {{db_seq}} is the only part that the precreated db provider fills in - the rest of those shell vars are set in ci-settings.sh

@eileenmcnaughton
Copy link
Contributor

@totten just revisiting this - what I'm looking for (not in the jenkins enviroment per ^^) is to be able to do

env CMS_DB_NAME=superspecialname civibuild create dmaster

I thought it must be possible since we allow various other things to be set or passed in - but db name just doesn't seem possible

@eileenmcnaughton
Copy link
Contributor

(I did previously do it by altering instances.yml but there are persistence issues if I do a civibuild destroy)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants