Merge AFF4 logical support (#3)
* Fix bug in map for striped.

* Make zip implementation more compliant with zip64

* Clarify hashing rules of block hasher.

* Add test case demonstrating cross container referencing.

* Remove test_memory in the short term.
Make reference_test python 3 compatible.

* Beginnings of aff4 CLI demo app
API evolution to support CLI

* Beginnings of aff4 CLI demo app
API evolution to support CLI

* Fix regression loading rekall generated memory images.

* Logical imaging prototype.
ZIP now uses UTF-8 filename storage (compatible with Windows explorer, WinRAR and 7-Zip)

* Remove debug.
Remove incomplete test.
Bump dependency versions.

* Update readme.

* Windows support for logical imaging.
Python3 fixes.

* Prevent progress messages being displayed during logical imaging.

* Bump version.

* Fix information.turtle read error.

* Detect and access winpmem logical images.
Bugfixes.

* Update copyright attribution.

* Fix Turtle serialisation to use volume ARN as base.
Make originalFileName camelhumps.

* Support AFF4 version propagation in library

* Initial AFF4 Logical append support.

* Added unit test demonstrating Push API for logical imaging

* Fix file writing on MacOS

* Fix unneeded triple persistence

* Fix unneeded triple persistence

* Fix issue with deflated zip seekable zip streams.

* Fix string serialization.

* Add logical test samples.

* Update README

* Update README

* Update README

* Fix regressions in test cases.
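
The commit message notes that ZIP member names are now stored as UTF-8. In the ZIP format this is signalled per member by bit 11 of the general purpose bit flag. A minimal sketch using Python's standard `zipfile` module (not pyaff4's own ZIP writer) shows the flag being set for a non-ASCII name; the helper name, member names and payload are made up for illustration:

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: member name is UTF-8 encoded


def utf8_flag_set(name):
    """Write one member called `name` to an in-memory ZIP and report whether
    the UTF-8 filename flag is set on it when the archive is read back."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, b"example payload")
    with zipfile.ZipFile(buf, "r") as zf:
        info = zf.infolist()[0]
        # The name must round-trip unchanged regardless of the flag.
        assert info.filename == name
        return bool(info.flag_bits & UTF8_FLAG)


if __name__ == "__main__":
    print(utf8_flag_set("plain.txt"))
    print(utf8_flag_set("r\u00e9sum\u00e9.txt"))
```

Readers that honour the flag decode the name as UTF-8 instead of falling back to a legacy code page.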
blschatz authored and scudette committed Oct 24, 2018
1 parent 618ad12 commit d79d654
Showing 38 changed files with 2,123 additions and 382 deletions.
23 changes: 14 additions & 9 deletions README.md
@@ -13,27 +13,32 @@ The focus of this implementation at present is reading images conforming with th
 AFF4 Standard v1.0. Canonical images are provided in the AFF4 Reference Images github
 project at https://github.com/aff4/ReferenceImages

-1. Reading ZipFile style volumes.
-2. Reading AFF4 Image streams using the deflate or snappy compressor.
-3. Reading RDF metadata using Turtle (and in some instances YAML for backwards compatibility).
+1. Reading, writing & appending to ZipFile style volumes.
+2. Reading striped ZipFile volumes.
+2. Reading & writing AFF4 ImageStreams using the deflate or snappy compressor.
+3. Reading RDF metadata using Turtle (and to some degree YAML).
 4. Verification of linear and block hashed images.

-## What is not yet supported.

+## What is in progress

+1. Reading & writing logical images.

+## What is not yet supported:

 The write support in the libraries is currently broken and being worked on. Other aspects of
 the AFF4 that have not yet been implemented in this codebase include:

 1. Encrypted AFF4 volumes.
 2. Persistent data store.
 3. HTTP backed streams.
-4. Splitting an AFF4 Image across multiple volumes.
-5. Map streams.
-6. Support for signed statements or Bill of Materials.
-7. Logical file acquisition.
+4. Support for signed statements or Bill of Materials.


 ## Notice

 This is not an official Google product (experimental or otherwise), it is just
-code that happens to be owned by Google.
+code that happens to be owned by Google and Schatz Forensic.

 ## References
 [1] "Extending the advanced forensic format to accommodate multiple data sources,
270 changes: 270 additions & 0 deletions aff4.py
@@ -0,0 +1,270 @@
# Copyright 2018 Schatz Forensic Pty Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.

import argparse
import sys, os, errno, shutil, uuid

from pyaff4 import container
from pyaff4 import lexicon, logical, escaping
from pyaff4 import rdfvalue, hashes, utils
from pyaff4 import block_hasher, data_store, linear_hasher

# here's the beginnings of a command line app for manipulating AFF4 images
# more of a PoC code example for now.
# author bradley@evimetry.com

VERBOSE = False
TERSE = False

def meta(file):
    volume = container.Container.openURNtoContainer(rdfvalue.URN.FromFileName(file))
    resolver = volume.resolver

    metadataURN = volume.urn.Append("information.turtle")
    try:
        with resolver.AFF4FactoryOpen(metadataURN) as fd:
            txt = fd.ReadAll()
            print(utils.SmartUnicode(txt))
    except Exception:  # best effort: print nothing if the metadata cannot be read
        pass


def list(file):
    volume = container.Container.openURNtoContainer(rdfvalue.URN.FromFileName(file))

    if issubclass(volume.__class__, container.PhysicalImageContainer):
        printDiskImageInfo(file, volume)
    elif issubclass(volume.__class__, container.LogicalImageContainer):
        printLogicalImageInfo(file, volume)

def printLogicalImageInfo(file, volume):
    printVolumeInfo(file, volume)
    printCaseInfo(volume)

    for image in volume.images():
        print ("\t%s <%s>" % (image.name(), trimVolume(volume.urn, image.urn)))


def printVolumeInfo(file, volume):
    volumeURN = volume.urn

    print("AFF4Container: file://%s <%s>" % (file, str(volumeURN)))

def printCaseInfo(volume):
    caseDetails = volume.getMetadata("CaseDetails")
    if caseDetails is None:
        return
    print ("\tCase Description: %s" % caseDetails.caseDescription)
    print ("\tCase Name: %s" % caseDetails.caseName)
    print ("\tExaminer: %s" % caseDetails.examiner)

def printDiskImageInfo(file, volume):
    printVolumeInfo(file, volume)
    printCaseInfo(volume)

    image = volume.getMetadata("DiskImage")
    print ("\t%s (DiskImage)" % image.urn)
    print ("\t\tSize: %s (bytes)" % image.size)
    print ("\t\tSize: %s (bytes)" % image.size)
    print ("\t\tSectors: %s" % image.sectorCount)
    print ("\t\tBlockMapHash: %s" % image.hash)

    # the following property is to test that unknown properties are handled OK
    print ("\t\tUnknownproperty: %s" % image.foobar)

    computerInfo = volume.getMetadata("ComputeResource")
    if computerInfo is not None:
        print ("\tAcquisition computer details")
        print ("\t\tSystem board vendor: %s" % computerInfo.systemboardVendor)
        print ("\t\tSystem board serial: %s" % computerInfo.systemboardSerial)
        print ("\t\tUnknownproperty: %s" % computerInfo.foobar)


class VerificationListener(object):
    def __init__(self):
        self.results = []

    def onValidBlockHash(self, a):
        pass

    def onInvalidBlockHash(self, a, b, imageStreamURI, offset):
        self.results.append("Invalid block hash comparison for stream %s at offset %d" % (imageStreamURI, offset))

    def onValidHash(self, typ, hash, imageStreamURI):
        self.results.append("Validation of %s %s succeeded. Hash = %s" % (imageStreamURI, typ, hash))

    def onInvalidHash(self, typ, a, b, streamURI):
        self.results.append("Invalid %s comparison for stream %s" % (typ, streamURI))

class LinearVerificationListener(object):
    def __init__(self):
        self.results = []

    def onValidHash(self, typ, hash, imageStreamURI):
        print ("\t\t%s Verified (%s)" % (typ, hash))

    def onInvalidHash(self, typ, hasha, hashb, streamURI):
        print ("\t\t%s Hash failure (stored = %s calculated = %s)" % (typ, hasha, hashb))



def trimVolume(volume, image):
    global TERSE
    if TERSE:
        volstring = utils.SmartUnicode(volume)
        imagestring = utils.SmartUnicode(image)
        if imagestring.startswith(volstring):
            imagestring = imagestring[len(volstring):]
        return imagestring
    else:
        return image


def verify(file):
    volume = container.Container.openURNtoContainer(rdfvalue.URN.FromFileName(file))
    printVolumeInfo(file, volume)
    printCaseInfo(volume)
    resolver = volume.resolver

    if type(volume) == container.PhysicalImageContainer:
        image = volume.image
        listener = VerificationListener()
        validator = block_hasher.Validator(listener)
        print("Verifying AFF4 File: %s" % file)
        validator.validateContainer(rdfvalue.URN.FromFileName(file))
        for result in listener.results:
            print("\t%s" % result)
    elif type(volume) == container.LogicalImageContainer:
        #print ("\tLogical Images:")
        hasher = linear_hasher.LinearHasher2(resolver, LinearVerificationListener())
        for image in volume.images():
            print ("\t%s <%s>" % (image.name(), trimVolume(volume.urn, image.urn)))
            hasher.hash(image)



def addPathNames(container_name, pathnames, recursive):
    with data_store.MemoryDataStore() as resolver:
        container_urn = rdfvalue.URN.FromFileName(container_name)
        urn = None
        with container.Container.createURN(resolver, container_urn) as volume:
            print("Creating AFF4Container: file://%s <%s>" % (container_name, volume.urn))
            for pathname in pathnames:
                pathname = utils.SmartUnicode(pathname)
                print ("\tAdding: %s" % pathname)
                fsmeta = logical.FSMetadata.create(pathname)
                if os.path.isdir(pathname):
                    image_urn = None
                    if volume.isAFF4Collision(pathname):
                        image_urn = rdfvalue.URN("aff4://%s" % uuid.uuid4())
                    else:
                        image_urn = volume.urn.Append(escaping.arnPathFragment_from_path(pathname), quote=False)

                    fsmeta.urn = image_urn
                    fsmeta.store(resolver)
                    resolver.Set(image_urn, rdfvalue.URN(lexicon.standard11.pathName), rdfvalue.XSDString(pathname))
                    resolver.Add(image_urn, rdfvalue.URN(lexicon.AFF4_TYPE), rdfvalue.URN(lexicon.standard11.FolderImage))
                    resolver.Add(image_urn, rdfvalue.URN(lexicon.AFF4_TYPE), rdfvalue.URN(lexicon.standard.Image))
                    if recursive:
                        for child in os.listdir(pathname):
                            pathnames.append(os.path.join(pathname, child))
                else:
                    with open(pathname, "rb") as src:
                        hasher = linear_hasher.StreamHasher(src, [lexicon.HASH_SHA1, lexicon.HASH_MD5])
                        urn = volume.writeLogicalStream(pathname, hasher, fsmeta.length)
                        fsmeta.urn = urn
                        fsmeta.store(resolver)
                        for h in hasher.hashes:
                            hh = hashes.newImmutableHash(h.hexdigest(), hasher.hashToType[h])
                            resolver.Add(urn, rdfvalue.URN(lexicon.standard.hash), hh)
        return urn

def extract(container_name, imageURNs, destFolder):
    with data_store.MemoryDataStore() as resolver:
        container_urn = rdfvalue.URN.FromFileName(container_name)
        urn = None

        with container.Container.openURNtoContainer(container_urn) as volume:
            printVolumeInfo(container_name, volume)
            resolver = volume.resolver
            for imageUrn in imageURNs:
                imageUrn = utils.SmartUnicode(imageUrn)

                pathName = next(resolver.QuerySubjectPredicate(imageUrn, volume.lexicon.pathName))

                with resolver.AFF4FactoryOpen(imageUrn) as srcStream:
                    if destFolder != "-":
                        destFile = os.path.join(destFolder, escaping.arnPathFragment_from_path(pathName.value))
                        if not os.path.exists(os.path.dirname(destFile)):
                            try:
                                os.makedirs(os.path.dirname(destFile))
                            except OSError as exc: # Guard against race condition
                                if exc.errno != errno.EEXIST:
                                    raise
                        with open(destFile, "wb") as destStream:  # binary mode: the stream yields bytes
                            shutil.copyfileobj(srcStream, destStream)
                            print ("\tExtracted %s to %s" % (pathName.value, destFile))
                    else:
                        shutil.copyfileobj(srcStream, sys.stdout)


def main(argv):
    parser = argparse.ArgumentParser(description='AFF4 command line utility.')
    parser.add_argument('-v', "--verify", action="store_true",
                        help='verify the objects in the container')
    parser.add_argument("--verbose", action="store_true",
                        help='enable verbose output')
    parser.add_argument('-t', "--terse", action="store_true",
                        help='enable terse output')
    parser.add_argument('-l', "--list", action="store_true",
                        help='list the objects in the container')
    parser.add_argument('-m', "--meta", action="store_true",
                        help='dump the AFF4 metadata found in the container')
    parser.add_argument('-f', "--folder", default=os.getcwd(),
                        help='the destination folder for extraction of logical images')
    parser.add_argument('-r', "--recursive", action="store_true",
                        help='add files and folders recursively')
    parser.add_argument('-c', "--create-logical", action="store_true",
                        help='create an AFF4 logical container containing srcFiles')
    parser.add_argument('-x', "--extract", action="store_true",
                        help='extract objects from the container')
    parser.add_argument('aff4container', help='the pathname of the AFF4 container')
    parser.add_argument('srcFiles', nargs="*", help='source files and folders to add as logical image')


    args = parser.parse_args(argv[1:])  # honour the argv passed in by the caller
    global TERSE
    global VERBOSE
    VERBOSE = args.verbose
    TERSE = args.terse

    if args.create_logical:
        dest = args.aff4container
        addPathNames(dest, args.srcFiles, args.recursive)
    elif args.meta:
        dest = args.aff4container
        meta(dest)
    elif args.list:
        dest = args.aff4container
        list(dest)
    elif args.verify:
        dest = args.aff4container
        verify(dest)
    elif args.extract:
        dest = args.aff4container
        extract(dest, args.srcFiles, args.folder)


if __name__ == "__main__":
    main(sys.argv)
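
The option handling in `main()` can be tried without an AFF4 container on disk. The sketch below rebuilds a subset of the parser declared above and mirrors the order in which the if/elif chain picks an action. The helper names `build_parser` and `pick_action` are illustrative only and are not part of aff4.py; help strings and the `--verbose`/`--terse` switches are left out for brevity.

```python
import argparse
import os


def build_parser():
    # Subset of the options declared in main(); help strings omitted.
    parser = argparse.ArgumentParser(description="AFF4 command line utility.")
    parser.add_argument("-v", "--verify", action="store_true")
    parser.add_argument("-l", "--list", action="store_true")
    parser.add_argument("-m", "--meta", action="store_true")
    parser.add_argument("-f", "--folder", default=os.getcwd())
    parser.add_argument("-r", "--recursive", action="store_true")
    parser.add_argument("-c", "--create-logical", action="store_true")
    parser.add_argument("-x", "--extract", action="store_true")
    parser.add_argument("aff4container")
    parser.add_argument("srcFiles", nargs="*")
    return parser


def pick_action(args):
    # Same precedence as the if/elif chain in main(): the first flag set wins.
    for name in ("create_logical", "meta", "list", "verify", "extract"):
        if getattr(args, name):
            return name
    return None


if __name__ == "__main__":
    args = build_parser().parse_args(["-c", "-r", "out.aff4", "docs", "notes.txt"])
    print(pick_action(args), args.aff4container, args.srcFiles, args.recursive)
```

Because the chain is if/elif, combining several action flags runs only one of them, and passing no action flag does nothing.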
2 changes: 1 addition & 1 deletion pyaff4/_version.py
@@ -12,7 +12,7 @@ def get_versions():
 def raw_versions():
     return json.loads("""
 {
-    "post": "6",
+    "post": "7",
     "version": "0.26",
     "rc": "0"
 }
12 changes: 11 additions & 1 deletion pyaff4/aff4.py
@@ -154,7 +154,7 @@ class AFF4VolumeProperties(object):


 class AFF4Object(object):
-    def __init__(self, resolver, urn=None):
+    def __init__(self, resolver, urn=None, *args, **kwargs):
         self.resolver = resolver
         self._dirty = False

@@ -217,6 +217,16 @@ def getDTB(self):
# some early images generated by Rekall don't contain a CR3
return 0

+class LogicalImage(AFF4Object):
+
+    def __init__(self, resolver, volume, urn, pathName):
+        super(LogicalImage, self).__init__(resolver, urn)
+        self.volume = volume
+        self.pathName = pathName
+
+    def name(self):
+        return self.pathName



SEEK_SET = 0
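
The `LogicalImage` class added in this hunk is a thin wrapper that carries a volume reference and the original path name. A stand-alone sketch of how the CLI's listing code consumes it follows. The base class here is a stub written for this example (the real `AFF4Object` constructor does more), and the resolver, volume, URN and path are placeholder values.

```python
class AFF4Object(object):
    # Stub standing in for pyaff4.aff4.AFF4Object; only what the sketch needs.
    def __init__(self, resolver, urn=None, *args, **kwargs):
        self.resolver = resolver
        self.urn = urn


class LogicalImage(AFF4Object):
    def __init__(self, resolver, volume, urn, pathName):
        super(LogicalImage, self).__init__(resolver, urn)
        self.volume = volume
        self.pathName = pathName

    def name(self):
        return self.pathName


if __name__ == "__main__":
    image = LogicalImage(resolver=None, volume=None,
                         urn="aff4://example/docs/report.txt",
                         pathName="/docs/report.txt")
    # Same output shape as the per-image line printed by the CLI listing.
    print("\t%s <%s>" % (image.name(), image.urn))
```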
8 changes: 5 additions & 3 deletions pyaff4/aff4_directory.py
@@ -23,6 +23,7 @@
 from pyaff4 import rdfvalue
 from pyaff4 import registry
 from pyaff4 import utils
+from pyaff4 import escaping

 LOGGER = logging.getLogger("pyaff4")

@@ -32,8 +33,9 @@ class AFF4Directory(aff4.AFF4Volume):
     root_path = ""

     @classmethod
-    def NewAFF4Directory(cls, resolver, root_urn):
+    def NewAFF4Directory(cls, resolver, version, root_urn):
         result = AFF4Directory(resolver)
+        result.version = version
         result.root_path = root_urn.ToFilename()

         mode = resolver.Get(root_urn, lexicon.AFF4_STREAM_WRITE_MODE)
@@ -70,8 +72,8 @@ def CreateMember(self, child_urn):
         # represent files and directories as the same path component we can not
         # allow slashes in the filename. Otherwise we will fail to create
         # e.g. stream/0000000 and stream/0000000/index.
-        filename = aff4_utils.member_name_for_urn(
-            child_urn, self.urn, slash_ok=False)
+        filename = escaping.member_name_for_urn(
+            child_urn, self.version, base_urn=self.urn, slash_ok=False)

         # We are allowed to create any files inside the directory volume.
         self.resolver.Set(child_urn, lexicon.AFF4_TYPE,
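
The comment in `CreateMember` explains why slashes cannot survive into directory-volume filenames: `stream/0000000` and `stream/0000000/index` would need the same path component to be both a file and a directory. The sketch below illustrates the idea with `urllib.parse.quote` from the standard library. It is not pyaff4's `escaping.member_name_for_urn`, whose exact escaping rules are not shown in this diff, and the function name and URNs are hypothetical.

```python
from urllib.parse import quote, unquote


def flat_member_name(child_urn, base_urn):
    """Map a child URN to a single filename with no path separators."""
    # Strip the volume prefix so the name is relative to the volume.
    if child_urn.startswith(base_urn):
        relative = child_urn[len(base_urn):].lstrip("/")
    else:
        relative = child_urn
    # safe="" percent-encodes "/" too, so the result is one path component.
    return quote(relative, safe="")


if __name__ == "__main__":
    base = "aff4://volume"
    for urn in ("aff4://volume/stream/0000000",
                "aff4://volume/stream/0000000/index"):
        name = flat_member_name(urn, base)
        print(name, "->", unquote(name))
```

With the slash encoded, both members become sibling files in the volume directory and the original relative name can still be recovered by unquoting.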