Exception on /api/v1/orchestrations/pipeline-schedules AttributeError: 'NoneType' object has no attribute 'get' #6240

Closed
BuzzCutNorman opened this issue Jun 19, 2022 · 16 comments
@BuzzCutNorman
Contributor

BuzzCutNorman commented Jun 19, 2022

I sometimes get this error when starting meltano ui on a Windows machine. The error also happens when I navigate between tabs in the UI. The error is associated with the following API call in the log:

Exception on /api/v1/orchestrations/pipeline-schedules [GET]

Here is the ending bit of the error

  File "C:\development\meltano\src\meltano\core\project_plugins_service.py", line 83, in current_plugins
    self._current_plugins = self.config_service.current_meltano_yml.plugins
  File "C:\development\meltano\src\meltano\core\config_service.py", line 35, in current_meltano_yml
    self._current_meltano_yml = self.project.meltano
  File "C:\development\meltano\src\meltano\core\project.py", line 230, in meltano
    return MeltanoFile.parse(self.project_files.load())
  File "C:\development\meltano\src\meltano\core\project_files.py", line 66, in load
    included_file_contents = self._load_included_files()
  File "C:\development\meltano\src\meltano\core\project_files.py", line 150, in _load_included_files
    for path in self.include_paths:
  File "C:\development\meltano\src\meltano\core\project_files.py", line 59, in include_paths
    include_path_patterns = self.meltano.get("include_paths", [])
AttributeError: 'NoneType' object has no attribute 'get'

It can't find the attribute get, which I think is supposed to be a method of the class? I also have not been able to determine what get is supposed to do. It seems to read the include_paths entries from meltano.yml and return all the paths in a list. In project_files.py there are these back-to-back properties, one of which is meltano.

    @property
    def meltano(self):
        """Return the contents of this projects `meltano.yml`."""
        if self._meltano is None:
            with open(self._meltano_file_path) as melt_f:
                self._meltano = yaml.safe_load(melt_f)
        return self._meltano

    @property
    def include_paths(self) -> List[Path]:
        """Return list of paths derived from glob patterns defined in the meltanofile."""
        include_path_patterns = self.meltano.get("include_paths", [])
        return self._resolve_include_paths(include_path_patterns)
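For reference, a minimal sketch of what get does in that line, assuming the parsed meltano.yml is a plain dict. The helper name and the glob pattern are made up for illustration; only the get call itself mirrors the code above.

```python
def include_path_patterns(config):
    """Mirror of the failing line: look up a key, fall back to a default."""
    return config.get("include_paths", [])


# Hypothetical parsed meltano.yml contents, not taken from a real project.
with_patterns = {"version": 1, "include_paths": ["./extra/*.meltano.yml"]}
without_patterns = {"version": 1}

print(include_path_patterns(with_patterns))     # the list stored under the key
print(include_path_patterns(without_patterns))  # the default, an empty list

try:
    # None is not a dict, so it has no get method at all.
    include_path_patterns(None)
except AttributeError as err:
    print(err)
```

So get is the ordinary dict method, and the traceback means the object it was called on was None rather than a dict.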

In project.py there is a meltano property that returns a MeltanoFile instance:

    @property
    def meltano(self) -> MeltanoFile:
        """Return a copy of the current meltano config.

        Returns:
            the current meltano config
        """
        with self._meltano_rw_lock.read_lock():
            return MeltanoFile.parse(self.project_files.load())

The MeltanoFile class in meltano_file.py has a get_plugins_for_mappings method.

@tayloramurphy
Collaborator

@kgpayne @alexmarple looping you two in here since this seems to be a possible bug with the UI and/or multi-file YAML.

@BuzzCutNorman can you share a bit about your Meltano setup too? Version of meltano and python version along with your meltano.yml file?

Also, cc @visch since this is Windows 😄

@BuzzCutNorman
Contributor Author

@tayloramurphy sure thing. I have Python 3.9.12 installed. Meltano was installed using pipx and reports meltano, version 2.1.0. I get the error message even when I start meltano ui in a freshly initialized test project with the default meltano.yml file.

cc @visch @kgpayne @alexmarple

@tayloramurphy
Collaborator

@aaronsteers adding to Engineering Assignments so we can get some eyes on it then we can figure out when to prioritize the fix.

@aaronsteers
Contributor

File "C:\development\meltano\src\meltano\core\project_files.py", line 59, in include_paths
include_path_patterns = self.meltano.get("include_paths", [])
AttributeError: 'NoneType' object has no attribute 'get'

This points to self.meltano being None, which I can't immediately explain.

@cjohnhanson - Can you take a look?

@cjohnhanson
Contributor

@BuzzCutNorman (cc: @aaronsteers) this may be a bit hard to fully debug without a Windows machine to test on. But I was able to replicate the error and stack trace by replacing my meltano.yml with an empty file. If the meltano.yml file isn't found in the project root, then meltano gives the informative "meltano ui must be run inside a Meltano project" error message, but if the meltano.yml is an empty file, no specific informative error is thrown; it just parses the empty YAML into a None.

Do you get this same error when you run other commands in the same project where meltano ui fails? E.g., if you get this error after running meltano ui and then run meltano config in the same project, do you get the same error? If so, that might indicate that Meltano is finding an empty meltano.yml file in the directory that it considers to be the project root, which could mean:

  1. The project is somehow getting initialized with an empty meltano.yml
  2. The meltano.yml file is somehow getting overwritten with a None object somewhere in the call stack
  3. Meltano considers the project root to be a directory which either has an empty meltano.yml file in it OR the behavior of with open(<meltano file path>) as melt_f: and then yaml.safe_load(melt_f) behaves differently on Windows than on Unix machines, such that an empty file is created in whatever directory Meltano considers to be the project root.
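The empty-file case could be caught with a guard along these lines. This is only a sketch: the exception name and the function are hypothetical, and fake_safe_load is a stand-in so the snippet runs without PyYAML installed (PyYAML's yaml.safe_load returns None for an empty document, which is the behavior being imitated).

```python
class EmptyMeltanoFileError(Exception):
    """Hypothetical error for a meltano.yml that does not parse to a mapping."""


def load_meltano_config(text, parse):
    """Parse project config text and refuse anything that is not a mapping.

    `parse` is any YAML loader, for example yaml.safe_load.
    """
    config = parse(text)
    if not isinstance(config, dict):
        raise EmptyMeltanoFileError(
            f"meltano.yml parsed to {type(config).__name__}, expected a mapping"
        )
    return config


def fake_safe_load(text):
    """Stand-in loader: None for blank input, a fixed dict otherwise."""
    return {"version": 1} if text.strip() else None


print(load_meltano_config("version: 1\n", fake_safe_load))

try:
    load_meltano_config("", fake_safe_load)
except EmptyMeltanoFileError as err:
    print(err)
```

A check like this would turn the bare AttributeError into a message that names the file, although it would not address the other two possibilities in the list.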

@BuzzCutNorman
Contributor Author

BuzzCutNorman commented Jun 23, 2022

@cjohnhanson (cc: @aaronsteers ) I can verify that after running meltano init --no_usage_stats test-meltano, the project did get a populated meltano.yml. This is the content of the test-meltano project's meltano.yml file after initialization:

version: 1
default_environment: dev
project_id: fcda620d-db88-4a3a-8c4f-aaf6b404e5a7
send_anonymous_usage_stats: false
environments:
- name: dev
- name: staging
- name: prod

I added an extractor, tap-postgres, to run the second requested test. The new meltano.yml looks like this:

version: 1
default_environment: dev
project_id: fcda620d-db88-4a3a-8c4f-aaf6b404e5a7
send_anonymous_usage_stats: false
plugins:
  extractors:
  - name: tap-postgres
    variant: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
      dbname: datawarehouse
      default_replication_method: FULL_TABLE
      filter_schemas: raw
      user: username
environments:
- name: dev
- name: staging
- name: prod

I started meltano ui and, once it opened in my browser, navigated from the extractor tab to the loader tab to the pipeline tab and then back until I got the error. Once I got the error, I ran meltano config tap-postgres list in another command prompt, which returned this output:

PS C:\development\test-meltano> meltano config tap-postgres list
2022-06-23T23:24:50.065452Z [info     ] Environment 'dev' is active
host [env: TAP_POSTGRES_HOST] current value: 'localhost' (default)
        Host: PostgreSQL host
port [env: TAP_POSTGRES_PORT] current value: 5432 (default)
        Port: PostgreSQL port
user [env: TAP_POSTGRES_USER] current value: 'username' (from `meltano.yml`)
        User: PostgreSQL user
password [env: TAP_POSTGRES_PASSWORD] current value: 'password' (from `.env`)
        Password: PostgreSQL password
dbname [env: TAP_POSTGRES_DBNAME] current value: 'datawarehouse' (from `meltano.yml`)
        Database Name: PostgreSQL database name
ssl [env: TAP_POSTGRES_SSL] current value: False (default)
        SSL: Using SSL via postgres `sslmode='require'` option. If the server does not accept SSL connections or the client certificate is not recognized the connection will fail
filter_schemas [env: TAP_POSTGRES_FILTER_SCHEMAS] current value: 'raw' (from `meltano.yml`)
        Filter Schemas: Scan only the specified comma-separated schemas to improve the performance of data extraction
default_replication_method [env: TAP_POSTGRES_DEFAULT_REPLICATION_METHOD] current value: 'FULL_TABLE' (from `meltano.yml`)
max_run_seconds [env: TAP_POSTGRES_MAX_RUN_SECONDS] current value: 43200 (default)
        Max Run Seconds: Stop running the tap after certain number of seconds
logical_poll_total_seconds [env: TAP_POSTGRES_LOGICAL_POLL_TOTAL_SECONDS] current value: 10800 (default)
        Logical Poll Total Seconds: Stop running the tap when no data received from wal after certain number of seconds
break_at_end_lsn [env: TAP_POSTGRES_BREAK_AT_END_LSN] current value: True (default)
        Break At End LSN: Stop running the tap if the newly received lsn is after the max lsn that was detected when the tap started

To learn more about extractor 'tap-postgres' and its settings, visit https://hub.meltano.com/extractors/tap-postgres--transferwise

I set --log-level debug when I started the UI. Here is the output from the console
buzzcutnorman-meltano-ui-console-none-type-20220623.log
and the meltano-ui.log file from the project.

@BuzzCutNorman
Contributor Author

@cjohnhanson (cc: @aaronsteers ) Yesterday I created a local branch of Meltano and added the following logger call at line 51 of project_files.py to assist in troubleshooting the third scenario you mentioned.

logger.debug("project_files meltano property used")

With the logger added, I started a poetry shell from the Meltano directory, navigated to the test-meltano project folder, and ran meltano --log-level debug ui. No matter what I do, I cannot get the error to happen. If I press Ctrl+C to stop the UI, comment out the logger in project_files.py, save, and then start meltano --log-level debug ui again, I can get the error to occur by navigating down the tabs and then back up them.

@BuzzCutNorman
Contributor Author

@cjohnhanson (cc: @aaronsteers ) Working more with this, I found that the logger only helps if the debug logging level is enabled. Once that is taken away, you are right back to the error occurring. So the logger is just giving a slight amount of time for something to complete, which led me to this line of thinking. The error might occur when the ProjectFiles class is loaded. During __init__ it sets

self._meltano = None

and the load() function that is being called does a reset_cache(), which really just sets

self._meltano = None

again. These happen really close together, so the self._meltano variable is getting cleared right when it is needed. I tested this out by removing the logger and instead commenting out line 65 in project_files.py

#self.reset_cache()

and ran meltano ui. I am not able to make the error occur, just like when the logger was in place. I think this might explain why the logger was fixing the issue.
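The suspected interleaving can be sketched in isolation. Everything below is a hypothetical stand-in, not Meltano's code: ProjectFilesSketch only imitates the lazy cache described above, and the before_return hook exists purely to force the unlucky ordering every time instead of waiting for it to happen by chance.

```python
import threading


class ProjectFilesSketch:
    """Hypothetical stand-in for a lazily cached meltano property."""

    def __init__(self, before_return=None):
        self._meltano = None
        # Test-only hook that pauses the reader between "load" and "return".
        self._before_return = before_return

    def reset_cache(self):
        self._meltano = None

    @property
    def meltano(self):
        if self._meltano is None:
            self._meltano = {"include_paths": []}  # pretend meltano.yml was parsed
        if self._before_return is not None:
            self._before_return()
        # Another thread may have reset the cache since the check above.
        return self._meltano


def demonstrate_race():
    loaded = threading.Event()
    reset_done = threading.Event()

    def pause_reader():
        loaded.set()                 # tell the other thread the cache is filled
        reset_done.wait(timeout=5)   # wait until it has cleared the cache

    files = ProjectFilesSketch(before_return=pause_reader)

    def resetter():
        loaded.wait(timeout=5)
        files.reset_cache()
        reset_done.set()

    other = threading.Thread(target=resetter)
    other.start()
    value = files.meltano  # reader thread: fills the cache, then gets interrupted
    other.join()
    return value


print(demonstrate_race())  # None in this forced interleaving
```

Calling .get on that returned value would raise the same AttributeError as in the traceback, which is consistent with the logger and the commented-out reset_cache() both hiding the problem by changing the timing.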

@cjohnhanson
Contributor

@BuzzCutNorman that's a great find, thanks for diving in deeper here.

Just to 100% confirm -- you're only seeing this behavior with the meltano ui command, correct?

If so, I think we can safely assume that this is a race condition between the two worker threads that refer to the same underlying Project which are run by meltano ui.

As far as I can tell, we wrap references to the Project.project_files property in an appropriate lock such that it should be thread safe. But it looks like we don't require a lock within the body of the project_files property itself. I think a quick solution to this would be to just wrap the body of the Project.project_files property method within a context manager which acquires the relevant read lock. CC: @aaronsteers and @pandemicsyn in case you have more insight here.
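A rough sketch of the locking idea, under loud assumptions: it uses a plain threading.RLock rather than the reader/writer lock that project.py uses, the class and function names are hypothetical, and it guards the cache itself rather than the Project.project_files property. It only illustrates that holding one lock across the reset and the read removes the window in which the cache can be cleared.

```python
import threading


class LockedProjectFiles:
    """Hypothetical stand-in for ProjectFiles with a guarded cache."""

    def __init__(self):
        # Assumption: a re-entrant exclusive lock, not a reader/writer lock.
        self._lock = threading.RLock()
        self._meltano = None

    def reset_cache(self):
        with self._lock:
            self._meltano = None

    @property
    def meltano(self):
        with self._lock:
            if self._meltano is None:
                self._meltano = {"include_paths": []}  # pretend parsed file
            return self._meltano

    def load(self):
        # Reset and re-read under one lock so no other thread can clear
        # the cache between the two steps.
        with self._lock:
            self.reset_cache()
            return self.meltano.get("include_paths", [])


def stress_load(files, threads=8, iterations=500):
    """Call load() from several threads and collect any AttributeError."""
    errors = []

    def worker():
        for _ in range(iterations):
            try:
                files.load()
            except AttributeError as err:
                errors.append(err)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return errors


print(len(stress_load(LockedProjectFiles())))
```

Whether a shared read lock alone gives the same guarantee depends on how the writer side is locked, so this should be read as an illustration of the principle, not as the change that went into the PR.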

I'm both surprised we've never had reports of this behavior before and that it happens so consistently on your system. I'll get a PR with the change put together and @BuzzCutNorman once I do it'd be great if you could double check that it resolves the issue on your Windows machine, since we're having trouble replicating the issue elsewhere.

@BuzzCutNorman
Contributor Author

@cjohnhanson (cc: @aaronsteers , @pandemicsyn ) Yes, I only see this behavior when I am navigating the UI after starting it by calling meltano ui. I am more than happy to give it a test once you are ready. I have never tried to grab someone's PR and work with it locally so I will need to do some research on how to best do that.

@BuzzCutNorman
Contributor Author

@cjohnhanson (cc: @aaronsteers , @pandemicsyn ) OK, I figured out how I can test this. I have some results to share back. Is it best to put my comments here or in the PR? I am guessing the PR, but just wanted to check.

@cjohnhanson
Contributor

@BuzzCutNorman Glad you were able to test locally, I was working out how best to test on my end and was struggling to find a Windows machine that I could easily test on.

Go ahead and share results in the PR and we can continue the discussion there.

@visch
Collaborator

visch commented Jun 27, 2022

I can verify that I get this on Meltano with the Uvicorn feature flag enabled. I can't replicate this on Linux.

@BuzzCutNorman
Contributor Author

@visch I have concluded I am not the best at testing upstream PRs locally. When you get a chance please test the PR for this issue so Cody can get a trustworthy Windows machine result. Thank you in advance.

@pandemicsyn
Contributor

pandemicsyn commented Jun 30, 2022

fyi - Took some grinding but I was actually able to reproduce this race on my Mac as well, so it's not limited to Windows:

2022-06-30T18:15:10.409270Z [error    ] Exception on /api/v1/plugins/all [GET]
<!--- output trimmed  ---!>
    self._current_meltano_yml = self.project.meltano
  File "/Users/syn/projects/meltano/src/meltano/core/project.py", line 236, in meltano
    return MeltanoFile.parse(self.project_files.load())
  File "/Users/syn/projects/meltano/src/meltano/core/project_files.py", line 66, in load
    included_file_contents = self._load_included_files()
  File "/Users/syn/projects/meltano/src/meltano/core/project_files.py", line 150, in _load_included_files
    for path in self.include_paths:
  File "/Users/syn/projects/meltano/src/meltano/core/project_files.py", line 59, in include_paths
    include_path_patterns = self.meltano.get("include_paths", [])
AttributeError: 'NoneType' object has no attribute 'get'
2022-06-30T18:15:10.454269Z [info     ] Error: 500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

The most consistent way for me was just using one of the HTTP benchmark tools and hitting a few of the API endpoints as fast as possible.
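For anyone without a benchmark tool handy, a small Python stand-in could look like the sketch below. It is not the tool used above. The two endpoint paths come from the tracebacks in this thread, while the function names, the request counts, and the base URL in the usage note are assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request

# Endpoint paths taken from the errors reported in this issue.
ENDPOINTS = [
    "/api/v1/plugins/all",
    "/api/v1/orchestrations/pipeline-schedules",
]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request, including error codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def hammer_endpoints(base_url, requests_per_endpoint=200, workers=16,
                     fetch=fetch_status):
    """Fire many concurrent GETs and count the status codes that come back."""
    urls = [base_url + path for path in ENDPOINTS] * requests_per_endpoint
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(fetch, urls))
```

With meltano ui running, something like hammer_endpoints("http://localhost:5000") would then show whether any 500 responses appear among the 200s. The port is an assumption about the local setup and may need adjusting.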

@visch
Collaborator

visch commented Jul 1, 2022

@visch I have concluded I am not the best at testing upstream PRs locally. When you get a chance please test the PR for this issue so Cody can get a trustworthy Windows machine result. Thank you in advance.

There are no tests in the UI, don't blame yourself for not testing every code path! Tests are supposed to do that :D

The UI was at least loading for me, then I changed something on my local machine and now I'm getting

Please run `make bundle` from src/webapp of the Meltano project. when I load the project... hmm, not sure what I did

EDIT:

Figured this out.

If I build the project myself locally, i.e. something like

pipx install .
meltano ui

Meltano wants me to build the webapp folder manually.

If instead I install meltano from pypi

pipx install meltano
meltano ui

Life is good. Something must be bundled with the PyPI build (yarn build or something, I don't know)
