AWS open search compatibility (should be quick to fix) #1865

Open
ajwillo opened this issue Dec 20, 2022 · 1 comment


ajwillo commented Dec 20, 2022

  • [X] Tested with the current Haystack master branch

Expected behaviour

The index is built.

Actual behaviour

The index is not built because of an incompatibility with OpenSearch.

Steps to reproduce the behaviour

Try to use AWS OpenSearch (which is Elasticsearch-compatible and should work with Haystack).

Configuration

  • Operating system version: docker python
  • Search engine version: AWS OpenSearch
  • Python version: 3.10.1
  • Django version: 4.1.3
  • Haystack version: 3.2.1

I'm trying to use AWS OpenSearch with Django Haystack, which Amazon has suggested is fully compatible. However, when I run "./manage.py rebuild_index" with a connection pointed at the platform, I get this error:

    Traceback (most recent call last):
      File "/data/app/myapp/./manage.py", line 22, in <module>
        execute_from_command_line(sys.argv)
      File "/usr/local/lib/python3.10/site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
        utility.execute()
      File "/usr/local/lib/python3.10/site-packages/django/core/management/__init__.py", line 440, in execute
        self.fetch_command(subcommand).run_from_argv(self.argv)
      File "/usr/local/lib/python3.10/site-packages/django/core/management/base.py", line 402, in run_from_argv
        self.execute(*args, **cmd_options)
      File "/usr/local/lib/python3.10/site-packages/django/core/management/base.py", line 448, in execute
        output = self.handle(*args, **options)
      File "/usr/local/lib/python3.10/site-packages/haystack/management/commands/rebuild_index.py", line 64, in handle
        call_command("clear_index", **clear_options)
      File "/usr/local/lib/python3.10/site-packages/django/core/management/__init__.py", line 198, in call_command
        return command.execute(*args, **defaults)
      File "/usr/local/lib/python3.10/site-packages/django/core/management/base.py", line 448, in execute
        output = self.handle(*args, **options)
      File "/usr/local/lib/python3.10/site-packages/haystack/management/commands/clear_index.py", line 64, in handle
        backend.clear(commit=self.commit)
      File "/usr/local/lib/python3.10/site-packages/haystack/backends/elasticsearch7_backend.py", line 117, in clear
        self.conn.indices.delete(index=self.index_name, ignore=404)
      File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 347, in _wrapped
        return func(*args, params=params, headers=headers, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/indices.py", line 334, in delete
        return self.transport.perform_request(
      File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 421, in perform_request
        _ProductChecker.raise_error(self._verified_elasticsearch)
      File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 638, in raise_error
        raise UnsupportedProductError(message)
    elasticsearch.exceptions.UnsupportedProductError: The client noticed that the server is not a supported distribution of Elasticsearch

When we dug into this by looking at "site-packages/elasticsearch/transport.py" and comparing the Amazon OpenSearch headers with Elasticsearch ones, the only difference we found involves "build_flavor" on the Amazon OpenSearch side. That field is used by the check_product function in transport.py:

    @classmethod
    def check_product(cls, headers, response):
        # type: (dict[str, str], dict[str, str]) -> int
        """Verifies that the server we're talking to is Elasticsearch.
        Does this by checking HTTP headers and the deserialized
        response to the 'info' API. Returns one of the states above.
        """
        try:
            version = response.get("version", {})
            version_number = tuple(
                int(x) if x is not None else 999
                for x in re.search(
                    r"^([0-9]+)\.([0-9]+)(?:\.([0-9]+))?", version["number"]
                ).groups()
            )
        except (KeyError, TypeError, ValueError, AttributeError):
            # No valid 'version.number' field, effectively 0.0.0
            version = {}
            version_number = (0, 0, 0)

        # Check all of the fields and headers for missing/valid values.
        try:
            bad_tagline = response.get("tagline", None) != "You Know, for Search"
            bad_build_flavor = version.get("build_flavor", None) != "default"
            bad_product_header = (
                headers.get("x-elastic-product", None) != "Elasticsearch"
            )
        except (AttributeError, TypeError):
            bad_tagline = True
            bad_build_flavor = True
            bad_product_header = True

        # 7.0-7.13 and there's a bad 'tagline' or unsupported 'build_flavor'
        if (7, 0, 0) <= version_number < (7, 14, 0):
            if bad_tagline:
                return cls.UNSUPPORTED_PRODUCT
            elif bad_build_flavor:
                return cls.UNSUPPORTED_DISTRIBUTION

        elif (
            # No version or version less than 6.x
            version_number < (6, 0, 0)
            # 6.x and there's a bad 'tagline'
            or ((6, 0, 0) <= version_number < (7, 0, 0) and bad_tagline)
            # 7.14+ and there's a bad 'X-Elastic-Product' HTTP header
            or ((7, 14, 0) <= version_number and bad_product_header)
        ):
            return cls.UNSUPPORTED_PRODUCT

        return True
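
To see which of these fields a given server trips on, the comparisons above can be pulled out into a small standalone helper. This is only a sketch for illustration: `diagnose` is a hypothetical name, it is not part of the elasticsearch client, and the sample response is made up rather than captured from a real OpenSearch domain.

```python
import re


def diagnose(headers, info):
    """Report which product-check fields differ from what the client expects.

    `headers` are the HTTP response headers and `info` is the deserialized
    body of the root "info" endpoint, mirroring check_product's arguments.
    """
    version = info.get("version") or {}
    match = re.search(
        r"^([0-9]+)\.([0-9]+)(?:\.([0-9]+))?", str(version.get("number", ""))
    )
    # Same convention as the quoted code: a missing patch number becomes 999,
    # and an unparsable version is treated as 0.0.0.
    version_number = (
        tuple(int(x) if x is not None else 999 for x in match.groups())
        if match
        else (0, 0, 0)
    )
    return {
        "version_number": version_number,
        "bad_tagline": info.get("tagline") != "You Know, for Search",
        "bad_build_flavor": version.get("build_flavor") != "default",
        "bad_product_header": headers.get("x-elastic-product") != "Elasticsearch",
    }


# Hypothetical response body, for illustration only.
sample_info = {
    "version": {"number": "7.10.2", "build_flavor": "oss"},
    "tagline": "You Know, for Search",
}
report = diagnose({}, sample_info)
print(report)
```

With a reported version in the 7.0 to 7.13 range, a good tagline, and a "build_flavor" other than "default", the quoted function would return UNSUPPORTED_DISTRIBUTION, which matches the wording of the error in the traceback.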

I commented out the build flavour checks in this function, and then the index built successfully! But I've noticed that the file "elasticsearch/transport.py" and the "def check_product" code don't exist in this repo. I don't know where this is pulled from or how I can create a build with these lines commented out.
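
The paths in the traceback place transport.py under site-packages/elasticsearch, which suggests it belongs to the separately installed `elasticsearch` Python client rather than to django-haystack. A minimal sketch for confirming where a package is imported from and which version is installed, using only the standard library (`locate` is a hypothetical helper name):

```python
import importlib.util
from importlib import metadata


def locate(package):
    """Return (file the package is imported from, installed version)."""
    spec = importlib.util.find_spec(package)
    origin = spec.origin if spec else None
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Not installed as a distribution (or installed under another name).
        version = None
    return origin, version


# Prints (None, None) if the client is not installed in this environment.
print(locate("elasticsearch"))
```

If the check turns out to come from the client version, pinning a different `elasticsearch` release in the project's requirements may be less fragile than editing files inside site-packages; which release is appropriate is something to verify against the client's changelog.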

Thanks

@AnnaBNana

@ajwillo I tested with the setup listed below and am unable to reproduce your error. When I run ./manage.py rebuild_index, I am able to write to an OpenSearch index successfully. Without more detail on how you configured your OpenSearch domain, it is hard to say what the problem is here. Here are the details of how my application is set up:

Django==4.1.7
django-haystack==3.2.1
elasticsearch==7.13.3
Python 3.11.0b5

settings.py

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch7_backend.Elasticsearch7SearchEngine',
        'URL': <DOMAIN-ENDPOINT>,
        'INDEX_NAME': 'haystack',
        'KWARGS': {
            'port': 443,
            'http_auth': (USERNAME, PASSWORD),
        }
    },
}
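
For anyone copying the settings above, here is a sketch of the same dictionary with the placeholders filled from environment variables so that it is valid Python as written. The variable names (OPENSEARCH_URL, OPENSEARCH_USER, OPENSEARCH_PASSWORD) and the fallback URL are assumptions for illustration, not part of the setup described here.

```python
import os

# Same structure as the settings above; only the placeholder values differ.
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.elasticsearch7_backend.Elasticsearch7SearchEngine",
        # Hypothetical variable name; the fallback is a non-resolvable dummy URL.
        "URL": os.environ.get("OPENSEARCH_URL", "https://example-domain.invalid"),
        "INDEX_NAME": "haystack",
        "KWARGS": {
            "port": 443,
            "http_auth": (
                os.environ.get("OPENSEARCH_USER", ""),
                os.environ.get("OPENSEARCH_PASSWORD", ""),
            ),
        },
    },
}
```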

I configured my OpenSearch instance as a test domain, according to instructions found here https://docs.aws.amazon.com/opensearch-service/latest/developerguide/gsgcreate-domain.html

During setup I selected Enable compatibility mode, as it sounded like this might be required since I used version 7.x on the application side. My only thought is you may need to configure your OpenSearch domain differently to get this working. Hope this helps, don't hesitate to reach out, would be happy to help you troubleshoot.
