Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add sqlfluff CLI option to pass the configuration files #1244

Closed
Arshaku opened this issue Aug 4, 2021 · 12 comments · Fixed by #1986
Closed

add sqlfluff CLI option to pass the configuration files #1244

Arshaku opened this issue Aug 4, 2021 · 12 comments · Fixed by #1986
Labels
config options Additional configuration options whether in CLI or config file enhancement New feature or request

Comments

@Arshaku
Copy link

Arshaku commented Aug 4, 2021

I would like to have possibility to pass sqlfluff configuration files via CLI option (instead of current predefined list of directories to look up for config files) with similar inheritance/override logic we have now.

The use case for which I need it:
I have multiple python projects in my company, and they all use a python module I've created which contains all the common configuration and functionality used by all my projects. So now I want to put most the sqlfluff configuration (like the rule configurations, excluded rules, etc) in the shared module and inherit from there, as those configs will be the same for all my projects, but for example, jinja context variables are project-specific, so I don't want to put them in the config file in the shared python module. I want to have those jinja context variables per-project and have them within the project directory.
And when running the sqlfluff command from the project, I can for example list 2 config files - one from the project and one from the installed python module.

Does this make sense?

@Arshaku Arshaku added the enhancement New feature or request label Aug 4, 2021
@Arshaku
Copy link
Author

Arshaku commented Aug 6, 2021

another use case where this feature would be helpful:
In some of my sql files with jinja templates, I have boolean Jinja context variable used something like this:

{% if someBooleanVariable %}
... some big sql chunk here...
{% else %}
... some other big sql chunk here
 {% endif %}
... another big sql chunk here

with current solution in the .sqlfluff configuration file I can set only one value for "someBooleanVariable" leaving big section of my sql without linting.

if I would be able to pass list of configuration files to sqlfluff CLI, I could use it like this:
sqlfluff lint test.sql -config_files ./config_bool_false.toml,./shared_config.toml
sqlfluff lint test.sql -config_files ./config_bool_true.toml,./shared_config.toml

where:
shared_config.toml contains all the shared configuration,
config_bool_false.toml sets the "someBooleanVariable" jinja template variable value to False,
config_bool_true.toml sets the "someBooleanVariable" jinja template variable value to True,

This would allow me to run sqlfluff linting twice and cover all the sql queries I have.

@tunetheweb
Copy link
Member

This would also be useful with the addition of SQL Fluff to the GitHub Super-Linter, as typically those linter files exist within the .github/linters/ directory, but instead for now need to be in the repo root directory (or a sub directory of that containing all the SQL files).

However, slightly different to @Arshaku 's request, I'd prefer to still be able to override at a directory level. So for example most of my queries are BigQuery dialect (so that would go in the default config file specified by command line), but one directory has MySQL code, so would prefer it's own config file.

@jpy-git
Copy link
Contributor

jpy-git commented Nov 24, 2021

@tunetheweb @barrywhart I'd be interested in adding this as I also use both MySQL and Snowflake in the same project.

Will add a --config-file option to the CLI and config_file option to the simple API (#1274). These will default to None, in which case we discover config files via the usual method. If this parameter is specified with a pathlike object I see there being two approaches.

Currently we load and overwrite the config in the following order:

  • default_config.cfg (package default)
  • setup.cfg
  • tox.ini
  • pep8.ini
  • .sqlfluff
  • pyproject.toml
  • Any command line arguments

Option 1:
The user has specified a config file therefore they should only expect configs from that file to be applied over the defaults (with extra command line args overwriting)
i.e.

  • default_config.cfg (package default)
  • <custom_config_file>
  • Any command line arguments

Option 2:
The user is expecting that the custom config file is applied on top of the existing config files. Motivation might be that they have a main SQL style guide setup in .sqlfluff but they have some files that are in a different dialect (and maybe some rules in .sqlfluff config don't apply to this other dialect), so they just want the extra config to be very minimal and have the majority of the custom config inherited from .sqlfluff e.g.

[sqlfluff]
dialect = mysql
exclude_rules = L042

i.e.

  • default_config.cfg (package default)
  • setup.cfg
  • tox.ini
  • pep8.ini
  • .sqlfluff
  • pyproject.toml
  • <custom_config_file>
  • Any command line arguments

Personally I would lean towards Option 1 as it's a lot more obvious which settings will be in use, but wanted to get a second opinion in case anyone felt strongly the other way?

@barrywhart
Copy link
Member

barrywhart commented Nov 24, 2021

If I read the original issue description correctly, aren't they asking for option 2?

pass sqlfluff configuration files via CLI option ... with similar inheritance/override logic we have now.

I want to put most the sqlfluff configuration (like the rule configurations, excluded rules, etc) in the shared module and inherit from there... [but] jinja context variables are project-specific ... I want to have those jinja context variables per-project and have them within the project directory

@jpy-git
Copy link
Contributor

jpy-git commented Nov 24, 2021

@barrywhart actually reading that back the issue is asking for option 3:

  • default_config.cfg (package default)
  • <custom_config_file> (global user default)
  • setup.cfg
  • tox.ini
  • pep8.ini
  • .sqlfluff
  • pyproject.toml
  • Any command line arguments

basically there are multiple separate projects with variations on a default config that is contained in a separate project.
I think this is slightly different to the other options above as it inserts the custom config at the start of the standard configs.

I think it would doable but it makes sense for that to be a separate --default-config-file option.

I think Options 1 and 2 are more suitable for the issue of requiring multiple configs within a single project?

So propose option 3 for this issue and then one of 1 or 2.

@jpy-git
Copy link
Contributor

jpy-git commented Nov 24, 2021

I'd still lean to 1 for in multiple in project configurations

@tunetheweb
Copy link
Member

I don't understand the reason for allowing overrides. If you have the ability to place a .sqlfluff config file, then you should place one in the top level.

To me the ability to provide a config file is most useful for those that want to store the config file separately for whatever reason. Therefore it should not require overrides IMHO.

On the other hand, if an override file exists maybe it should be used as could be confusing if it's not used 🤔
Or maybe there should be an optional force option to force it to use the supplied one? Not necessarily as part of the initial PR, but maybe as a future iteration.

But to be honest, while I wanted this initially (mostly cause that's how the GitHub Super Linter works for other linters), I personally have gone away from needing this and just got use to having it in the folder structure so less of a priority to me.

I do think it would be useful for the simple API thought as discussed in #1274 and in more detail in #1419 (comment)

@barrywhart
Copy link
Member

In the case they describe, does the exact order matter? IIUC, they're saying they'd define the high-level config in <custom_config_file> and the Jinja variables in the project-specific files. As long as the config files do not touch the same settings, order doesn't matter.

I agree that 1 is good for the case you describe, so I'm in favor of 1 as well as 2 or 3.

Think it's worth asking (possibly creating a poll) in the SQLFluff Slack, to get more input? More than once, I've learned the hard way that dbt users are almost a separate species of user. 🤣

@jpy-git
Copy link
Contributor

jpy-git commented Nov 24, 2021

To me the ability to provide a config file is most useful for those that want to store the config file separately for whatever reason. Therefore it should not require overrides IMHO.

@tunetheweb that was my thought process for 1 too 👍

Think it's worth asking (possibly creating a poll) in the SQLFluff Slack, to get more input?

@barrywhart I'll create a poll in the slack 👍 Let's keep it simple and say order doesn't matter then so between 1 or 2

@barrywhart
Copy link
Member

barrywhart commented Nov 24, 2021

Overrides won't be useful for most people, I think. One plausible scenario I can imagine is: you have a large (probably dbt) project written in various styles and levels of SQLFluff compliance. You're gradually cleaning it all up, but it's taking weeks or months, and you want to disable certain rules or use more relaxed settings for directories that have yet to be cleaned up. (I think @alanmcruickshank went through a process like this in the early days at Tails. (For more context, see the team rollout guide here.)

NOTE: In this comment, I'm just trying to provide context, not argue for a particular decision.

@aranke
Copy link

aranke commented Nov 24, 2021

Overrides would be helpful to fix this issue I opened a while back where SQLFluff picks up the wrong dbt config: #1468.

@jpy-git
Copy link
Contributor

jpy-git commented Nov 24, 2021

I guess the nice thing about overrides is that even if someone doesn't want to use them they can just not have sqlfluff config files in the current directory and keep their various sqlfluff configs in a separate folder that they reference via the CLI/API to get the same independent functionality of Option 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
config options Additional configuration options whether in CLI or config file enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants