Modern linux types can sometimes just be levels of indirection #151

Closed
ikelos opened this issue Dec 4, 2019 · 14 comments · Fixed by #199

@ikelos
Member

ikelos commented Dec 4, 2019

Just so we've got it recorded somewhere...

This has been noticed specifically on Linux (in the 5.3.0 kernel, at least): certain types (such as mm_struct) can contain unnamed fields (the unnamed_field_N entries) which act purely as levels of indirection (i.e., each one just contains another unnamed struct, with little reason to ever access the wrapper itself).

    "mm_struct": {
      "size": 1024,
      "fields": {
        "cpu_bitmap": {
          "type": {
            "count": 0,
            "kind": "array",
            "subtype": {
              "kind": "base",
              "name": "long unsigned int"
            }
          },
          "offset": 1024
        },
        "unnamed_field_0": {
          "type": {
            "kind": "struct",
            "name": "unnamed_b06fa817540c10e0"
          },
          "offset": 0
        }
      },
      "kind": "struct"
    },

This makes accessing members of mm_struct difficult without knowing the precise sub-structure they're within, and we might want to contemplate a way to reasonably remove these levels when they're literally just unnamed struct members (unions and other types might need more thinking about)...

@ikelos
Member Author

ikelos commented Jan 3, 2020

Ok, so our discussions somewhat decided that:

  • we should probably do this at the consumer level
  • it will most likely require additional information to tag unnamed fields
  • the consumer (volatility) should then traverse these fields when building member lists

@ikelos
Member Author

ikelos commented Jan 13, 2020

Ok, I've mocked up some code whereby the entries within fields can contain offset and type, and now also anonymous. If a field is marked as anonymous, the member is replaced by all the members of the referenced type, each offset by the location of the original field (recursively).
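
As a rough illustration of that flattening, here is a minimal sketch. It is not the code from the branch: the `flatten_fields` helper, the `TYPES` table, the inner member names and every offset other than those in the mm_struct example above are made up for illustration. It only expands anonymous members whose type is a struct, since unions were left as an open question.

```python
# Hypothetical, trimmed-down type table in the style of the JSON above.
# Inner member names and their offsets are invented for illustration.
TYPES = {
    "mm_struct": {
        "kind": "struct",
        "size": 1024,
        "fields": {
            "unnamed_field_0": {
                "type": {"kind": "struct", "name": "unnamed_b06fa817540c10e0"},
                "offset": 0,
                "anonymous": True,
            },
            "cpu_bitmap": {
                "type": {"kind": "array", "count": 0,
                         "subtype": {"kind": "base", "name": "long unsigned int"}},
                "offset": 1024,
            },
        },
    },
    "unnamed_b06fa817540c10e0": {
        "kind": "struct",
        "size": 1024,
        "fields": {
            "member_a": {"type": {"kind": "base", "name": "int"}, "offset": 0},
            "unnamed_field_1": {
                "type": {"kind": "struct", "name": "unnamed_inner_example"},
                "offset": 16,
                "anonymous": True,
            },
        },
    },
    "unnamed_inner_example": {
        "kind": "struct",
        "size": 16,
        "fields": {
            "member_b": {"type": {"kind": "base", "name": "int"}, "offset": 8},
        },
    },
}


def flatten_fields(types, type_name, base_offset=0):
    """Map member name -> (absolute offset, type), expanding anonymous structs."""
    members = {}
    for name, field in types[type_name]["fields"].items():
        offset = base_offset + field["offset"]
        field_type = field["type"]
        if field.get("anonymous", False) and field_type.get("kind") == "struct":
            # Replace the wrapper with the members of the referenced type,
            # shifted by the wrapper's own offset (recursively).
            members.update(flatten_fields(types, field_type["name"], offset))
        else:
            members[name] = (offset, field_type)
    return members


for name, (offset, _) in sorted(flatten_fields(TYPES, "mm_struct").items()):
    print(name, offset)
```

With this table, `member_b` ends up at offset 24 (16 for the inner wrapper plus 8 within it), and neither `unnamed_field_0` nor `unnamed_field_1` appears in the flattened member list.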

Things that still need deciding are the name for the flag (I've gone for anonymous for now) and whether it should be a full version bump (version 8.0.0) or just a minor bump to 6.2.0 (because it's technically a schema addition). I think I've chosen right by going with 6.2.0, but remind me to check it before we're done. I also don't know whether the flag should live in the fields area or on the structure itself. Which one's easier from your perspective? I can probably do either relatively easily.

Anyway, the branch to look at is issue151-flatten-anonymous, which doesn't contain any pdbgen changes, but contains a new schema and the ability to read the new intermed files. Shout about literally anything to do with this, I'm on holiday so happy to get it all changed whilst I've got time... :)

@ilch1

ilch1 commented Jan 22, 2020

I pushed a dwarf2json branch that implements the proposed solution: https://github.com/volatilityfoundation/dwarf2json/tree/issue-11-anonymous-types

Let me know what you think.

@ikelos
Member Author

ikelos commented Jan 22, 2020

Thanks! It looks ok, but I'm not sure it had ever been updated for schema versions > 6.0.0, meaning the metadata layout is slightly different. Can you please check that the generated JSON files pass the schema validator (ensure the jsonschema python package is installed)? You can also use the development/schema_validate.py script to test it...

@ilch1

ilch1 commented Jan 22, 2020

I'm not sure how to interpret the schema_validate.py output. Does it mean it passes?

[?] Validating file: ../dwarf2json/test/anonymous_types-new.json
INFO     volatility.schemas: Dependency for validation unavailable: jsonschema
DEBUG    volatility.schemas: All validations will report success, even with malformed input
[+] Validation successful: ../dwarf2json/test/anonymous_types-new.json
Failures []

@ikelos
Member Author

ikelos commented Jan 23, 2020

No, it means you don't have the jsonschema package installed. If you start python and try import jsonschema I expect it will fail... :S
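
A quick way to check this without triggering the import is sketched below. The `module_available` helper is a made-up name; it only relies on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if module_available("jsonschema"):
    print("jsonschema is available, so validation results are meaningful")
else:
    # Matches the DEBUG message above: without the dependency,
    # every validation is reported as successful.
    print("jsonschema is missing, so a 'Validation successful' line proves nothing")
```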

@ilch1

ilch1 commented Feb 11, 2020

schema_validate.py failed with the following error:

DEBUG    volatility.schemas: Validating JSON against schema...
DEBUG    volatility.schemas: Schema validation error
Traceback (most recent call last):
  File "./volatility/schemas/__init__.py", line 68, in valid
    jsonschema.validate(input, schema)
  File "/Users/ilya/venv/py3/lib/python3.7/site-packages/jsonschema/validators.py", line 934, in validate
    raise error
jsonschema.exceptions.ValidationError: Additional properties are not allowed ('source' was unexpected)

Failed validating 'additionalProperties' in schema[0]:
    {'additionalProperties': False,
     'properties': {'format': {'$ref': '#/definitions/metadata_format'},
                    'producer': {'$ref': '#/definitions/metadata_producer'}},
     'required': ['format']}

On instance:
    {'format': '6.2.0',
     'producer': {'name': 'dwarf2json', 'version': '0.6.0'},
     'source': {'file': 'anonymous_types.dSYM/Contents/Resources/DWARF/anonymous_types',
                'type': 'dwarf'}}
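
To make the failure easier to read: the schema fragment sets additionalProperties to False, so any key not listed under properties is rejected. The sketch below mimics only that one check in plain Python. It is a simplified stand-in and not the jsonschema library (for instance, it ignores patternProperties), and `unexpected_properties` is a hypothetical helper name.

```python
def unexpected_properties(instance: dict, schema: dict) -> list:
    """List the keys of `instance` that `additionalProperties: false` rejects."""
    if schema.get("additionalProperties", True) is not False:
        return []  # extra keys are allowed unless explicitly forbidden
    allowed = set(schema.get("properties", {}))
    return sorted(key for key in instance if key not in allowed)


# Schema fragment and instance copied from the error output above.
metadata_schema = {
    "additionalProperties": False,
    "properties": {
        "format": {"$ref": "#/definitions/metadata_format"},
        "producer": {"$ref": "#/definitions/metadata_producer"},
    },
    "required": ["format"],
}
metadata = {
    "format": "6.2.0",
    "producer": {"name": "dwarf2json", "version": "0.6.0"},
    "source": {
        "file": "anonymous_types.dSYM/Contents/Resources/DWARF/anonymous_types",
        "type": "dwarf",
    },
}

print(unexpected_properties(metadata, metadata_schema))  # ['source']
```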

I needed to apply the patch below for it to pass.

diff --git a/volatility/schemas/schema-6.2.0.json b/volatility/schemas/schema-6.2.0.json
index 931bae76..38abc46a 100644
--- a/volatility/schemas/schema-6.2.0.json
+++ b/volatility/schemas/schema-6.2.0.json
@@ -91,6 +91,9 @@
       "properties": {
         "type": {
           "type": "string"
+        },
+        "file": {
+          "type": "string"
         }
       }
     },
@@ -104,6 +107,9 @@
             },
             "producer": {
               "$ref": "#/definitions/metadata_producer"
+            },
+            "source": {
+              "$ref": "#/definitions/metadata_source"
             }
           },
           "required": [

@ikelos
Member Author

ikelos commented Feb 11, 2020

Hiya @ilch1, so I think we deprecated source in favour of OS-specific sections (such as the linux one). I'm not sure metadata_source is actually referenced anywhere in the schema any more?

Storing the data under these sections would allow us to store more OS-specific information in a more accessible format (as we've done for the pe/pdb information under windows). I'm also not sure that the file name that was used to produce the JSON is necessarily useful for the consumer (other than as an identifier to distinguish it from others, so perhaps we should add a GUID to each generated file)? It also gives the opportunity to leak information (such as /home/mike/stuff/...). So what's the issue that the filename is being included to solve?

@ilch1

ilch1 commented Mar 4, 2020

I pushed a commit to this branch that adds mac-specific metadata to the schema. The corresponding branch in dwarf2json produces compatible output.

@ilch1

ilch1 commented Mar 26, 2020

schema-6.2.0.json has a couple of typos. Here's the diff with the fix:

diff --git a/volatility/schemas/schema-6.2.0.json b/volatility/schemas/schema-6.2.0.json
index d2472ec9..1f388005 100644
--- a/volatility/schemas/schema-6.2.0.json
+++ b/volatility/schemas/schema-6.2.0.json
@@ -84,13 +84,13 @@
         "symbols": {
           "type": "array",
           "items": {
-            "$ref": "#/definitions/metadata_linux_item"
+            "$ref": "#/definitions/metadata_nix_item"
           }
         },
         "types": {
           "type": "array",
           "items": {
-            "$ref": "#/definitions/metadata_linux_item"
+            "$ref": "#/definitions/metadata_nix_item"
           }
         }
       },

The latest issue-11-anonymous-types branch in dwarf2json should work with this schema. Let me know if you run into any problems.
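
Typos like the one in the diff above can be caught mechanically, assuming the problem was that the reference named a definition that did not exist in the file. The sketch below walks a schema and reports local "$ref" targets with no matching entry under "definitions". The `dangling_refs` helper and the cut-down `example` schema (including the `metadata_nix` name) are hypothetical.

```python
def dangling_refs(schema: dict) -> list:
    """Return local "$ref" targets that have no entry under "definitions"."""
    prefix = "#/definitions/"
    defined = set(schema.get("definitions", {}))
    missing = set()

    def walk(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(prefix):
                name = ref[len(prefix):]
                if name not in defined:
                    missing.add(name)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(schema)
    return sorted(missing)


# Hypothetical, cut-down schema: one reference is correct, one is a typo.
example = {
    "definitions": {
        "metadata_nix_item": {"type": "object"},
        "metadata_nix": {
            "type": "object",
            "properties": {
                "symbols": {"type": "array",
                            "items": {"$ref": "#/definitions/metadata_linux_item"}},
                "types": {"type": "array",
                          "items": {"$ref": "#/definitions/metadata_nix_item"}},
            },
        },
    }
}

print(dangling_refs(example))  # ['metadata_linux_item']
```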

@ikelos
Member Author

ikelos commented Mar 26, 2020

Thanks, I made the changes you suggested, so if you're happy that it all works now I can merge it tomorrow... ;)

@ilch1

ilch1 commented Mar 26, 2020

I've merged the corresponding change in dwarf2json

@ikelos
Member Author

ikelos commented Mar 26, 2020

Thanks, I just merged the other changes. Feel free to reopen this if someone spots anything weird going on with anonymous types...

@ikelos
Member Author

ikelos commented Mar 26, 2020

We might also want to re-run dwarf2json against the various kernels in the linux.zip now?
