Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AL 4.5.0.27: updater cannot upgrade any service #230

Closed
kam193 opened this issue May 28, 2024 · 15 comments
Closed

AL 4.5.0.27: updater cannot upgrade any service #230

kam193 opened this issue May 28, 2024 · 15 comments
Assignees
Labels
accepted This issue was accepted, we will work on this at some point bug Something isn't working core In progress

Comments

@kam193
Copy link

kam193 commented May 28, 2024

Describe the bug
After upgrading to 4.5.0.27, AssemblyLine isn't able to upgrade any service. However, the upgrade looks successful if I tried to start a container with the new service image manually. I see a couple of errors from starting containers (about ODM schema), but I'm unable to say if they come from the update container (I guess they are not - I see errors from update and privileged services, even if I try to upgrade a regular service). My suspicious is, that for some reason the updating thread is silently failing before even starting the container.

Have you seen anything like this, or should I try to add some additional logs/try-except in the updater container?

To Reproduce
Steps to reproduce the behavior:

  1. Upgrade AssemblyLine to v27
  2. Try o update any service.

This is an example log from the updater, here for Swiffer (I tried with a few different services):

{"@timestamp": "2024-05-28 20:27:47,142", "event": { "module": "assemblyline", "dataset": "assemblyline.service.updater" }, "host": { "ip": "172.16.0.7", "hostname": "feade7f10c47" }, "log": { "level": "INFO", "logger": "assemblyline.service.updater" }, "process": { "pid": "1" }, "message": "[CU] Service Swiffer is being updated to version 4.5.0.stable6..."}
{"@timestamp": "2024-05-28 20:27:47,143", "event": { "module": "assemblyline", "dataset": "assemblyline.service.updater" }, "host": { "ip": "172.16.0.7", "hostname": "feade7f10c47" }, "log": { "level": "INFO", "logger": "assemblyline.service.updater" }, "process": { "pid": "1" }, "message": "[CI] Installing all services marked for install..."}
{"@timestamp": "2024-05-28 20:27:47,146", "event": { "module": "assemblyline", "dataset": "assemblyline.service.updater" }, "host": { "ip": "172.16.0.7", "hostname": "feade7f10c47" }, "log": { "level": "INFO", "logger": "assemblyline.service.updater" }, "process": { "pid": "1" }, "message": "[CI] Done installing services, waiting 60 seconds for next install..."}
{"@timestamp": "2024-05-28 20:28:03,531", "event": { "module": "assemblyline", "dataset": "assemblyline.service.updater" }, "host": { "ip": "172.16.0.7", "hostname": "feade7f10c47" }, "log": { "level": "ERROR", "logger": "assemblyline.service.updater" }, "process": { "pid": "1" }, "message": "[CU] Service Swiffer has failed to update because resulting service key (Swiffer_4.5.0.6) does not exist. Update procedure cancelled..."}

If I tried a following command (for Sigma which also couldn't be updated by AL, & I reused the network created by AL for the Safelist):

docker run --env SERVICE_API_HOST=http://service-server:5003 --env SERVICE_API_KEY=***--network assemblyline_service-net-Safelist --name test_al <registry cache>/cccs/assemblyline-service-sigma:4.5.0.stable7 

It's successful and AL recognize the new version.

Expected behavior
Services are upgraded

Screenshots

Environment (please complete the following information if pertinent):

  • Assemblyline Version: 4.5.0.27, docker compose
  • Browser: [e.g. chrome, safari]

Additional context
Here an example from errors I see in logs, but I think they should be gone after service upgrade and are unrelated:

May 28 20:07:01 docker2 270d7f531c45[2288004]: Traceback (most recent call last):
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "<frozen runpy>", line 198, in _run_module_as_main
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "<frozen runpy>", line 88, in _run_code
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/opt/al_service/badlist/update_server.py", line 19, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     from assemblyline_v4_service.updater.updater import ServiceUpdater
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_v4_service/updater/updater.py", line 27, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     from assemblyline_v4_service.common.base import SIGNATURES_META_FILENAME
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_v4_service/common/base.py", line 20, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     from assemblyline_v4_service.common.api import PrivilegedServiceAPI, ServiceAPI
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_v4_service/common/api.py", line 5, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     from assemblyline_core.badlist_client import BadlistClient
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_core/badlist_client.py", line 7, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     from assemblyline.datastore.helper import AssemblylineDatastore
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/datastore/helper.py", line 29, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     from assemblyline.odm.models.result import Result
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/odm/models/result.py", line 32, in <module>
May 28 20:07:01 docker2 270d7f531c45[2288004]:     constants = forge.get_constants()
May 28 20:07:01 docker2 270d7f531c45[2288004]:                 ^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/common/forge.py", line 123, in get_constants
May 28 20:07:01 docker2 270d7f531c45[2288004]:     config = get_config()
May 28 20:07:01 docker2 270d7f531c45[2288004]:              ^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/common/forge.py", line 117, in get_config
May 28 20:07:01 docker2 270d7f531c45[2288004]:     config_cache[yml_config] = _get_config(yml_config=yml_config)
May 28 20:07:01 docker2 270d7f531c45[2288004]:                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/common/forge.py", line 94, in _get_config
May 28 20:07:01 docker2 270d7f531c45[2288004]:     yml_data = yaml.safe_load(env_substitute(yml_fh.read()))
May 28 20:07:01 docker2 270d7f531c45[2288004]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/__init__.py", line 125, in safe_load
May 28 20:07:01 docker2 270d7f531c45[2288004]:     return load(stream, SafeLoader)
May 28 20:07:01 docker2 270d7f531c45[2288004]:            ^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/__init__.py", line 81, in load
May 28 20:07:01 docker2 270d7f531c45[2288004]:     return loader.get_single_data()
May 28 20:07:01 docker2 270d7f531c45[2288004]:            ^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 51, in get_single_data
May 28 20:07:01 docker2 270d7f531c45[2288004]:     return self.construct_document(node)
May 28 20:07:01 docker2 270d7f531c45[2288004]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 60, in construct_document
May 28 20:07:01 docker2 270d7f531c45[2288004]:     for dummy in generator:
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 413, in construct_yaml_map
May 28 20:07:01 docker2 270d7f531c45[2288004]:     value = self.construct_mapping(node)
May 28 20:07:01 docker2 270d7f531c45[2288004]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 218, in construct_mapping
May 28 20:07:01 docker2 270d7f531c45[2288004]:     return super().construct_mapping(node, deep=deep)
May 28 20:07:01 docker2 270d7f531c45[2288004]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 143, in construct_mapping
May 28 20:07:01 docker2 270d7f531c45[2288004]:     value = self.construct_object(value_node, deep=deep)
May 28 20:07:01 docker2 270d7f531c45[2288004]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 100, in construct_object
May 28 20:07:01 docker2 270d7f531c45[2288004]:     data = constructor(self, node)
May 28 20:07:01 docker2 270d7f531c45[2288004]:            ^^^^^^^^^^^^^^^^^^^^^^^
May 28 20:07:01 docker2 270d7f531c45[2288004]:   File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/yaml/constructor.py", line 427, in construct_undefined
May 28 20:07:01 docker2 270d7f531c45[2288004]:     raise ConstructorError(None, None,
May 28 20:07:01 docker2 270d7f531c45[2288004]: yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object:assemblyline.odm.base.TypedMapping'
May 28 20:07:01 docker2 270d7f531c45[2288004]:   in "<unicode string>", line 451, column 15:
May 28 20:07:01 docker2 270d7f531c45[2288004]:           INGEST: !!python/object:assemblyline.odm ... 
May 28 20:07:01 docker2 270d7f531c45[2288004]:                   ^
@kam193 kam193 added assess We still haven't decided if this will be worked on or not bug Something isn't working labels May 28, 2024
@cccs-rs
Copy link
Contributor

cccs-rs commented May 29, 2024

Hmm... I wonder if it has anything to do with:
https://github.com/CybercentreCanada/assemblyline-base/blob/90ba51e38c3e0e3bf01c8b1f4541902a89208ed0/assemblyline/odm/models/config.py#L1903

At least based on the last line of the error and that was the only new thing added to the Config ODM in the new release. Surprised I haven't seen this..

@cccs-rs
Copy link
Contributor

cccs-rs commented May 29, 2024

Could it be perhaps something related to your configuration? I just checked our logging stack and found no mention of the error, so I suspect it could be a parsing error for the metadata enforcement configuration that's causing the error to be raised.

I would suspect to see this error in the other core containers as well since they should all call forge.get_config()

@kam193
Copy link
Author

kam193 commented May 29, 2024

It may be, I have indeed tried to configure it, and for some reasons I didn't try the upgrade without configuring metadata. But the core containers didn't have any such problem, just privileged services and update containers. Anyway, I'll check it and come back, thanks for the tip

@cccs-rs cccs-rs self-assigned this May 29, 2024
@kam193
Copy link
Author

kam193 commented May 29, 2024

The current state of debugging:

  1. I have removed the submission key from the config at all & re-updated AL, tried to update any service - no success.
  2. As so, I started adding logs to the updated, and I ended up with information from the L131 (https://github.com/CybercentreCanada/assemblyline-core/blame/167dd89cbc2299321b4c3a13abde9a73823abab5/assemblyline_core/updater/run_updater.py#L131) that container excited with an unsuccessful status code.
  3. After getting the ID of the launched container, I managed to connect issues from logs with it, and this indeed the issue from ODM.

Now I'm looking into what exactly causes the failure.

@kam193
Copy link
Author

kam193 commented May 29, 2024

And we have the winner - what is, indeed, the config for metadata. However, this is not what core services see. They see the original configuration I wrote, but all privileged services, including the container for updating, gets the configuration prepared by the scaler. It looks like the scaler doesn't understand dumping it well - here is what it produces:

 max_temp_data_length: 4096
  metadata:
    archive: {}
    ingest:
      INGEST: !!python/object:assemblyline.odm.base.TypedMapping     <----- Line 451, as in the traceback
        index: false
        sanitizer: !!python/object/apply:re._compile
        - ^[A-Za-z0-9_ -.]*$
        - 32
        store: false
        type: !!python/object:assemblyline.odm.base.Compound
          ai: true
          child_type: !!python/name:assemblyline.odm.models.config.Metadata ''
          copyto: []
          default: null
          default_set: false
          deprecation: null
          description: null
          getter_function: null
          index: false
          multivalued: false
          name: null
          optional: false
          parent_name: null
          setter_function: null
          store: false
    submit: {}

I'm not entirely sure, but it looks like a dump of the TypedMapping object itself, not the data. Setting any configuration or removing the INGEST key doesn't help either. It looks like one of differences between docker compose and kubernetes deployment, so you might not see it.

@kam193
Copy link
Author

kam193 commented May 29, 2024

It also explains why my manual update worked - I didn't mount the prepared configuration, and things necessary for the service registration are still equal to defaults (yeah, I know they shouldn't, but it's still a hobby setup ;)), so it worked without the real config file.

@cccs-rs
Copy link
Contributor

cccs-rs commented May 29, 2024

Hmm... so it would stand to reason that if you were to shell into the Scaler container, then running:

from assemblyline.common import forge
forge.get_config().as_primitives()

Should yield the same error/garbage output?

@kam193
Copy link
Author

kam193 commented May 30, 2024

Not exactly, but it does not return really primitives. Let's have a look, all in the scaler container of the AL v27:

>>> c = forge.get_config().as_primitives()
>>> c["submission"]
{'default_max_extracted': 500, 'default_max_supplementary': 500, 'dtl': 30, 'emptyresult_dtl': 5, 'max_dtl': 0, 'max_extraction_depth': 6, 'max_file_size': 524288000, 'max_metadata_length': 4096, 'max_temp_data_length': 4096, 'metadata': {'archive': {}, 'submit': {}, 'ingest': {'INGEST': {}}}, 'sha256_sources': [], 'file_sources': [], 'tag_types': {'attribution': ['attribution.actor', 'attribution.campaign', 'attribution.exploit', 'attribution.implant', 'attribution.family', 'attribution.network', 'av.virus_name', 'file.config', 'technique.obfuscation'], 'behavior': ['file.behavior'], 'ioc': ['network.email.address', 'network.static.ip', 'network.static.domain', 'network.static.uri', 'network.dynamic.ip', 'network.dynamic.domain', 'network.dynamic.uri']}, 'verdicts': {'info': 0, 'suspicious': 300, 'highly_suspicious': 700, 'malicious': 1000}}
>>> # ^ Looks okay
>>> d = yaml.dump(c)
>>> "!!python/object:assemblyline.odm.base.TypedMapping" in d
True
>>> d
' ... \n  metadata:\n    archive: {}\n    ingest:\n      INGEST: !!python/object:assemblyline.odm.base.TypedMapping\n        index: false\n        sanitizer: !!python/object/apply:re._compile...'
>>> # But dumped wrongly
>>> type(c)
<class 'dict'>
>>> type(c["submission"])
<class 'dict'>
>>> type(c["submission"]["metadata"])
<class 'dict'>
>>> type(c["submission"]["metadata"]["ingest"])
<class 'dict'>
>>> type(c["submission"]["metadata"]["ingest"]["INGEST"])
<class 'assemblyline.odm.base.TypedMapping'>
>>> # ^ Because it's not a dict

@cccs-rs
Copy link
Contributor

cccs-rs commented May 30, 2024

I think I know a solution to fix that 😁

@cccs-rs cccs-rs added core accepted This issue was accepted, we will work on this at some point In progress and removed assess We still haven't decided if this will be worked on or not labels May 30, 2024
@cccs-rs
Copy link
Contributor

cccs-rs commented May 30, 2024

You can try testing with the 4.5.1.dev166 release to make sure there aren't any other issues that 4.5.0.27 introduced for the Docker deployments.

@kam193
Copy link
Author

kam193 commented May 30, 2024

I confirm with this release updating works and services look like functioning as well, thanks!

BTW I love the re-design, when does it come to the stable branch?

@cccs-rs
Copy link
Contributor

cccs-rs commented May 30, 2024

I can include the patch in the next stable, once the PR is approved!

The update to the iconography was included in the 4.5.0.28 release we pushed yesterday 😁

cccs-rs added a commit to CybercentreCanada/assemblyline-core that referenced this issue Jun 3, 2024
Account for ODM representation issues when dumping to YAML
@cccs-rs
Copy link
Contributor

cccs-rs commented Jun 3, 2024

This should be patched in the 4.5.0.29 release.

@cccs-rs cccs-rs closed this as completed Jun 3, 2024
@kam193
Copy link
Author

kam193 commented Jun 3, 2024

Thanks, I've just installed it and can confirm it works :)

@kam193
Copy link
Author

kam193 commented Jun 4, 2024

Hey, I think we both didn't test the service properly... I run it with the default configuration, but not with really configured metadata. When I played with metadata, the scaler service wasn't happy again:

{"@timestamp": "2024-06-04 16:53:41,417", "event": { "module": "assemblyline", "dataset": "assemblyline.scaler" }, "host": { "ip": "x.x.x.x", "hostname": "2a55612146dd" }, "log": { "level": "INFO", "logger": "assemblyline.scaler" }, "process": { "pid": "1" }, "message": "Found the service server at: b1d0d55f6dc96d9942b27fc5dfd5060e5e5cfaca41e17b24a58faf5c425c7e84 [assemblyline-service_server-1]"}
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_core/scaler/run_scaler.py", line 6, in <module>
    with ScalerServer() as scaler:
         ^^^^^^^^^^^^^^
  File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline_core/scaler/scaler_server.py", line 365, in __init__
    yaml.dump(json.loads(self.config.json()), handle)
                         ^^^^^^^^^^^^^^^^^^
  File "/var/lib/assemblyline/.local/lib/python3.11/site-packages/assemblyline/odm/base.py", line 1403, in json
    return json.dumps(self.as_primitives())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Metadata is not JSON serializable

However, it looks to be a problem with the ingest metadata declaration (which I don't need anyway, so it's not a big issue for me), because the following configuration fails:

...
submission:
  metadata:
    submit:
      python.package_name:
        validator_type: text
        required: false
      python.uploader:
        validator_type: text
        required: false
    ingest:
      INGEST:
        python.package_name:
          validator_type: text
          required: false

but when I remove the ingest key entirely, the scaler service is working again. It looks like either I misunderstood the expected config syntax, or the as_primitives does not handle the ingest definition properly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This issue was accepted, we will work on this at some point bug Something isn't working core In progress
Projects
None yet
Development

No branches or pull requests

2 participants