Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Make .meltano folder path configurable via new MELTANO_SYS_DIR_ROOT env var #6628

Merged
merged 21 commits into from
Oct 20, 2022

Conversation

rawwar
Copy link
Contributor

@rawwar rawwar commented Aug 14, 2022

draft PR for #6619

@netlify
Copy link

netlify bot commented Aug 14, 2022

👷 Deploy request for meltano pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit ea199f0

@aaronsteers aaronsteers linked an issue Aug 15, 2022 that may be closed by this pull request
@aaronsteers
Copy link
Contributor

aaronsteers commented Aug 15, 2022

@rawwar - Thanks for this contribution! For my part, the implementation looks pretty clean. I'll let @pandemicsyn take the official review though. I've approved CI tests to run on this PR. Feel free to move out of "Draft" status when you see tests passing.

@pandemicsyn, @edgarrmondragon, @tayloramurphy - The only things I see to pause on here is the naming of the environment variable - and whether this needs to be a setting or if a env var parse is fine.

Some options regarding possible names/spellings:

  • DOT_MELTANO_ROOT
  • MELTANO_TEMP_DIR
    • MELTANO_TEMP_DIR_ROOT
  • MELTANO_SYSTEM_DIR
    • MELTANO_SYSTEM_ROOT
    • MELTANO_SYSDIR
    • MELTANO_SYSDIR_ROOT
    • MELTANO_SYS_DIR_ROOT

I think I like following the convention of all env vars starting with MELTANO_ so I would personally lean towards something like MELTANO_TEMPDIR or MELTANO_SYSTEM_DIR.

@tayloramurphy
Copy link
Collaborator

@aaronsteers since we have MELTANO_PROJECT_ROOT I like keeping "root" in the name. My preference would either be MELTANO_TEMP_DIR_ROOT or MELTANO_SYS_DIR_ROOT. I'm lean towards the first once since it makes it seems less "stable" and would encourage folks to rely on it less. But I'll leave it to you to make the final call now that my preference is voiced 👍

@pandemicsyn
Copy link
Contributor

@aaronsteers since we have MELTANO_PROJECT_ROOT I like keeping "root" in the name. My preference would either be MELTANO_TEMP_DIR_ROOT or MELTANO_SYS_DIR_ROOT

I'm a no-go on MELTANO_TEMP_DIR_ROOT because I think that could lead contributors (either to meltano, or via extensions) to use it as unstructured dumping ground like they would an actual temp dir. That might be ok, but feel like that should be something more intentional, like $MELTANO_TMP_DIR_ROOT=$MELTANO_PROJECT_ROOT/.meltano/temp.

Beyond that I have no prefs on naming. Sounds like MELTANO_SYS_DIR_ROOT is the defacto choice then ?

@rawwar
Copy link
Contributor Author

rawwar commented Aug 15, 2022

The only reason I raised it as a Draft PR was that I KNEW that variable name was bad 😄 . I'll use MELTANO_SYS_DIR_ROOT and update the code. Also, do I include it as a $MELTANO_SYS_DIR_ROOT/.meltano or consider $MELTANO_SYS_DIR_ROOT as a replacement to .meltano?

@pandemicsyn
Copy link
Contributor

The only reason I raised it as a Draft PR was that I KNEW that variable name was bad 😄 . I'll use MELTANO_SYS_DIR_ROOT and update the code. Also, do I include it as a $MELTANO_SYS_DIR_ROOT/.meltano or consider $MELTANO_SYS_DIR_ROOT as a replacement to .meltano?

My vote would be to consider $MELTANO_SYS_DIR_ROOT as a replacement to .meltano. So that the default is implied to be the equivalent of:

MELTANO_SYS_DIR_ROOT=$MELTANO_PROJECT_ROOT/.meltano

but you could override it too:

MELTANO_SYS_DIR_ROOT=/some/custom/.meltano
MELTANO_SYS_DIR_ROOT=/mnt/persistent0
...

@aaronsteers @tayloramurphy curious though if y'all where thinking about it the same way.

src/meltano/core/project.py Outdated Show resolved Hide resolved
Co-authored-by: Edgar R. M. <edgarrm358@gmail.com>
@aaronsteers
Copy link
Contributor

aaronsteers commented Aug 16, 2022

My vote would be to consider $MELTANO_SYS_DIR_ROOT as a replacement to .meltano.

So that the default is implied to be the equivalent of:

MELTANO_SYS_DIR_ROOT=$MELTANO_PROJECT_ROOT/.meltano

but you could override it too:

MELTANO_SYS_DIR_ROOT=/some/custom/.meltano\nMELTANO_SYS_DIR_ROOT=/mnt/persistent0

👍👍

@rawwar
Copy link
Contributor Author

rawwar commented Aug 16, 2022

I'm a bit stuck on how to correct settings.yml.
If we want to use $MELTANO_SYS_DIR_ROOT as a replacement to .meltano, what do I update the following to?

- name: database_uri
  value: sqlite:///$MELTANO_PROJECT_ROOT/.meltano/meltano.db
  env_specific: true

If $MELTANO_SYS_DIR_ROOT is defined, we no longer need .meltano and if it is not defined, .meltano needs to be included. I feel that I can update db.py to decide what to use based on the availability of the $MELTANO_SYS_DIR_ROOT env variable. If env variable doesn't exist, i go by the default value, else, i use the $MELTANO_SYS_DIR_ROOT/.meltano.db.
But, We also have $DATABASE_URI, which will just override the default settings

src/meltano/core/bundle/settings.yml Outdated Show resolved Hide resolved
src/meltano/core/project.py Outdated Show resolved Hide resolved
@edgarrmondragon
Copy link
Collaborator

I'm a bit stuck on how to correct settings.yml. If we want to use $MELTANO_SYS_DIR_ROOT as a replacement to .meltano, what do I update the following to?

- name: database_uri
  value: sqlite:///$MELTANO_PROJECT_ROOT/.meltano/meltano.db
  env_specific: true

If $MELTANO_SYS_DIR_ROOT is defined, we no longer need .meltano and if it is not defined, .meltano needs to be included. I feel that I can update db.py to decide what to use based on the availability of the $MELTANO_SYS_DIR_ROOT env variable. If env variable doesn't exist, i go by the default value, else, i use the $MELTANO_SYS_DIR_ROOT/.meltano.db. But, We also have $DATABASE_URI, which will just override the default settings

@rawwar with the changes I suggested, the default dot meltano dir would live in <project root>/.meltano, so the default sqlite db would still live in sqlite:///<project root>/.meltano/meltano.db

@rawwar
Copy link
Contributor Author

rawwar commented Aug 16, 2022

@rawwar with the changes I suggested, the default dot meltano dir would live in <project root>/.meltano, so the default sqlite db would still live in sqlite:///<project root>/.meltano/meltano.db

But, From the above discussion, we want to consider DOT_MELTANO_ROOT(AKA $MELTANO_SYS_DIR_ROOT) will be representing .meltano folder itself. So, If I do keep things as it is in settings.yml, I need to update in db.py to consider $MELTANO_SYS_DIR_ROOT/meltano.db as the db uri.

@pandemicsyn
Copy link
Contributor

@rawwar with the changes I suggested, the default dot meltano dir would live in <project root>/.meltano, so the default sqlite db would still live in sqlite:///<project root>/.meltano/meltano.db

But, From the above discussion, we want to consider DOT_MELTANO_ROOT(AKA $MELTANO_SYS_DIR_ROOT) will be representing .meltano folder itself. So, If I do keep things as it is in settings.yml, I need to update in db.py to consider $MELTANO_SYS_DIR_ROOT/meltano.db as the db uri.

@rawwar @edgarrmondragon yea wouldn't the default entry just become:

- name: database_uri
  value: sqlite:///$MELTANO_SYS_DIR_ROOT/meltano.db
  env_specific: true

Which by default then resolves $MELTANO_SYS_DIR_ROOT to $MELTANO_PROJECT_ROOT/.meltano ?

@rawwar
Copy link
Contributor Author

rawwar commented Aug 17, 2022

Correct me If I am wrong.
so, if we keep default as the following

- name: database_uri
  value: sqlite:///$MELTANO_SYS_DIR_ROOT/meltano.db
  env_specific: true

when user sets $MELTANO_PROJECT_ROOT to /home/my_project and does not set $MELTANO_SYS_DIR_ROOT. default value is evaluated to sqlite:////meltano.db . This would mean that meltano.db is created in the folder where user is running the command.
One way I can think of is updating env property of Project class to include MELTANO_SYS_DIR_ROOT. So, i updated it as follows.

@property
    def env(self):
        """Get environment variables for this project.

        Returns:
            dict of environment variables and values for this project.
        """
        environment_name = (
            self.active_environment.name if self.active_environment else ""
        )
        return {
            PROJECT_ROOT_ENV: str(self.root),
            PROJECT_ENVIRONMENT_ENV: environment_name,
            PROJECT_SYS_DIR_ROOT: str(self.sys_dir_root)
        }

@edgarrmondragon
Copy link
Collaborator

One way I can think of is updating env property of Project class to include MELTANO_SYS_DIR_ROOT. So, i updated it as follows.

@rawwar Yes, you're right! I think that's the correct approach.

@rawwar rawwar marked this pull request as ready for review August 17, 2022 21:11
@rawwar rawwar requested review from edgarrmondragon and pandemicsyn and removed request for edgarrmondragon and pandemicsyn August 17, 2022 21:12
@rawwar
Copy link
Contributor Author

rawwar commented Aug 17, 2022

I just realized while trying to write tests for #6627 that I need to write tests for this as well. I might take a while for writing tests.

Copy link
Collaborator

@tayloramurphy tayloramurphy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggested change, but otherwise approving from a docs perspective.

docs/src/_concepts/project.md Outdated Show resolved Hide resolved
Copy link
Contributor

@aaronsteers aaronsteers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ 👍

@edgarrmondragon
Copy link
Collaborator

This would also mean, we need to decide which takes precedence over others.

@rawwar I think it makes sense for the $MELTANO_SYS_DIR_ROOT to take precedence if it's present in the environment. Wdyt?

@rawwar
Copy link
Contributor Author

rawwar commented Sep 1, 2022

@rawwar I think it makes sense for the $MELTANO_SYS_DIR_ROOT to take precedence if it's present in the environment. Wdyt?

I have been trying to implement this. Was having trouble doing it properly.
Issue is that, my current implementation is failing for project init and working for other commands. Since, during init, there won't be meltano.yml, but, my code was trying to read it .. is causing issues. trying to handle this case.

Edit1: Giving more info
So, order of evaluation when meltano.yml already exists is $MELTANO_SYS_DIR_ROOT, meltano.sys_dir_root and then default path which is $MELTANO_PROJECT_ROOT/.meltano
But, for init command, it should be $MELTANO_SYS_DIR_ROOT, $MELTANO_PROJECT_ROOT/.meltano .

@aaronsteers
Copy link
Contributor

aaronsteers commented Sep 1, 2022

This would also mean, we need to decide which takes precedence over others.

@rawwar I think it makes sense for the $MELTANO_SYS_DIR_ROOT to take precedence if it's present in the environment. Wdyt?

As of now, I don't think we have a formal dev standard for this. But the above is consistent with what we might adopt as a standard if we take this as reference:

From https://clig.dev/#configuration:

Apply configuration parameters in order of precedence. Here is the precedence for config parameters, from highest to lowest:

  • Flags
  • The running shell’s environment variables
  • Project-level configuration (eg. .env)
  • User-level configuration
  • System wide configuration

cc @pandemicsyn, who is taking lead on CLI dev standards:

@pandemicsyn
Copy link
Contributor

As of now, I don't think we have a formal dev standard for this. But the above is consistent with what we might adopt as a standard if we take this as reference:

+1 we've not codified anything as far a control layers but pretty sure when it comes to a hierarchy what @aaronsteers referenced will almost certainly be our default stance.

Since, during init, there won't be meltano.yml, but, my code was trying to read it .. is causing issues. trying to handle this case.

@rawwar did you manage to work around this, or is this still a sticking point for ya?

@rawwar
Copy link
Contributor Author

rawwar commented Sep 15, 2022

As of now, I don't think we have a formal dev standard for this. But the above is consistent with what we might adopt as a standard if we take this as reference:

+1 we've not codified anything as far a control layers but pretty sure when it comes to a hierarchy what @aaronsteers referenced will almost certainly be our default stance.

Since, during init, there won't be meltano.yml, but, my code was trying to read it .. is causing issues. trying to handle this case.

@rawwar did you manage to work around this, or is this still a sticking point for ya?

I did try a week ago. But, I could not. I was stuck at it. I'll be looking into it today. I'll give an update here once I'm done.

@sbalnojan
Copy link
Contributor

Hey @rawwar, do you need need help to push this over the finish line?

@rawwar
Copy link
Contributor Author

rawwar commented Oct 14, 2022

Hey @rawwar, do you need need help to push this over the finish line?

I had some personal things going on. I will try to look into it this weekend. If I can't complete it, I do need help. Thank you so much!

@rawwar
Copy link
Contributor Author

rawwar commented Oct 19, 2022

Hey @rawwar, do you need need help to push this over the finish line?

I tried to resolve the issue since weekend and I am unable to. Would appreciate if I can get some help with this

@edgarrmondragon
Copy link
Collaborator

@rawwar I'll take a look

@edgarrmondragon edgarrmondragon force-pushed the kalyan/dot_meltano branch 2 times, most recently from d4e1db1 to feda7e9 Compare October 19, 2022 21:59
Copy link
Collaborator

@edgarrmondragon edgarrmondragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rawwar ok, I got the merge conflicts solved after a few attempts 😅.

This is looking good. Pinging @aaronsteers in case I miss anything.

@aaronsteers aaronsteers changed the title feat: Make .meltano folder path configurable feat: Make .meltano folder path configurable via new MELTANO_SYS_DIR_ROOT env var Oct 19, 2022
Copy link
Contributor

@aaronsteers aaronsteers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edgarrmondragon - Looks good from my side. Just one question here: https://github.com/meltano/meltano/pull/6628/files#r1000023143 [Resolved!]

If all other issues/questions resolved, do you want to do the honors with a merge to main?

@edgarrmondragon edgarrmondragon merged commit 91a349d into meltano:main Oct 20, 2022
@rawwar
Copy link
Contributor Author

rawwar commented Oct 20, 2022

@edgarrmondragon .. Thank you so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feature: Make .meltano path configurable
7 participants