November bughunt #329

Merged
merged 11 commits into from Nov 20, 2020

@rakuy0 rakuy0 commented Nov 13, 2020

Roll up of a number of things that have been bothering me for a while

  • We decode ud* correctly, but never really added an INS_* definition for it. So added those.
  • int1/icebp support since we apparently never had that
  • Making sure we don't codeflow past those, and bringing in a fix from my py3 branch for not code flowing past hlt instructions.
  • Even if we fail on codeblock addition, we can at least add the metadata to the function dictionary.
  • We parse but do not make accessible the fixed file info from a PE file (should it be found)
  • Enable ARM analysis on PE files
  • pathcount in the UI had succumbed to bitrot and needed fixing.
  • some symbolik reduction code coverage
  • Address vivisect 0.1.0 loops indefinitely #322
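
For anyone skimming the codeflow bullets above, here is a minimal sketch of the idea, not the actual vivisect code. The `NO_FALLTHROUGH` set and `walk_linear` helper are hypothetical names made up for illustration; the point is only that a linear walk stops once it hits an instruction that never falls through to the next one.

```python
# Mnemonics treated as terminal here; this set is an assumption for the sketch,
# not the list vivisect actually uses.
NO_FALLTHROUGH = frozenset({"hlt", "int1", "icebp", "ud0", "ud1", "ud2"})


def walk_linear(mnemonics):
    """Return the mnemonics visited before flow stops at a terminal instruction."""
    visited = []
    for mnem in mnemonics:
        visited.append(mnem)
        if mnem in NO_FALLTHROUGH:
            # Do not continue into the bytes that follow; they may be data.
            break
    return visited


flow = walk_linear(["push", "mov", "hlt", "add", "ret"])
print(flow)
```

Everything after the `hlt` is left unvisited, which is the behavior the bullets describe for hlt, int1/icebp and ud*.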

atlas0fd00m previously approved these changes Nov 19, 2020

@atlas0fd00m atlas0fd00m left a comment

got the one question about the use of dict(), but good enough for me. i'm going to approve no matter what the answer.


vw.setFunctionMeta(funcva, 'Size', size)
vw.setFunctionMeta(funcva, 'BlockCount', bcnt)
vw.setFunctionMeta(funcva, 'InstructionCount', opcount)
vw.setFunctionMeta(funcva, 'MnemDist', mnem)
vw.setFunctionMeta(funcva, 'MnemDist', dict(mnem))

Contributor

is this a "best practices" thing? why would we create a new dict, when the old dict is not going to be used again?

Contributor Author

It's actually a minor display thing. I got tired of seeing the collections.defaultdict(<int>, {......} stuff in the output UI since it disables the pretty printing of that.
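
A small sketch of the display difference, using made-up mnemonic counts rather than real workspace data. It only shows how the two reprs differ in plain Python; how the vivisect UI renders either one is not shown here.

```python
import collections
import pprint

# Hypothetical mnemonic distribution, built the way a counter usually is.
mnem = collections.defaultdict(int)
for name in ("mov", "push", "mov", "ret"):
    mnem[name] += 1

as_default = repr(mnem)        # carries the defaultdict(...) wrapper
as_plain = repr(dict(mnem))    # plain {...} mapping, same keys and values

print(as_default)
pprint.pprint(dict(mnem))
```

The copy made by `dict()` is shallow and the contents compare equal, so the stored metadata is the same either way.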

try:
    langid = name_id & 0x3ff
    sublangid = name_id >> 10
except:

Contributor

i suspect this is because langid or sublangid are None? but should we be specific here for the exception?

Contributor Author

It's actually because name_id ends up being a unicode string because pieces of the header aren't quite right, but we can still possibly recover bits of information.

It's non-essential and I can remove it if need be. It was a nice to have that was a quick fix.
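
On the question of being specific about the exception: a possible sketch, assuming the only failure mode is the one described (name_id arriving as a string instead of an int). `split_lang_id` is a hypothetical helper for illustration, not a function in the PE parser.

```python
def split_lang_id(name_id):
    """Split a resource name id into (langid, sublangid), or None if it is not an int."""
    try:
        langid = name_id & 0x3ff
        sublangid = name_id >> 10
    except TypeError:
        # A string name_id does not support the bitwise operators, so give up
        # on the language info but let the caller keep going.
        return None
    return langid, sublangid


print(split_lang_id(0x0409))
print(split_lang_id(u"abc"))
```

Catching `TypeError` alone keeps the recovery path for the malformed header case while letting unrelated errors surface.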

continue
except Exception as e:
logger.warn('parseOpcode error at 0x%.8x (addCodeFlow(0x%x)): %s', va, startva, e)
logger.warn('Other: parseOpcode error at 0x%.8x (addCodeFlow(0x%x)): %s', va, startva, e)

Contributor

👍

@rakuy0 rakuy0 merged commit d35d6e5 into master Nov 20, 2020
@rakuy0 rakuy0 deleted the rakuyo_bughunt_112020 branch November 20, 2020 17:13
atlas0fd00m added a commit to atlas0fd00m/vivisect that referenced this pull request Jan 14, 2021
* We decode ud* correctly, but never really added an INS_* definition for it. So added those.
* int1/icebp support since we apparently never had that
* Making sure we don't codeflow past those, and bringing in a fix from my py3 branch for not code flowing past hlt instructions.
* Even if we fail on codeblock addition, we can at least add the metadata to the function dictionary.
* We parse but do not make accessible the fixed file info from a PE file (should it be found)
* Enable ARM analysis on PE files
* Pathcount in the UI had succumbed to bitrot and needed fixing.
* Some symbolik reduction code coverage
* Address vivisect#322

Co-authored-by: atlas0fd00m <atlas@r4780y.com>
atlas0fd00m added a commit that referenced this pull request Aug 26, 2022
here marks the end of an era...
farewell oh long-toothed PR.  long live symbolik switchcase analysis!
much love and pain hath gone into thy creation and refinement.  go forth and make life better for all viv users.

* symboliks-based switch-case analysis for arbitrary platforms.  breaks on
some graph core changes, but will fix that before making a pull request.

* still truing up switch cases after merge.  added thunk_bx detection as
it is helpful in identifying variables used in switch case analysis

* more truing

* GitEye .gitignore

* DynamicBranches analysis turned into a default VaSet for every architecture, and cleanup

* unittest: Viv test_vivisect check graph edges as well as nodes.

* updating documentation and MAX case count.

* cleanup some of the functionality we pulled out of the switchcase commit
(supporting functionality has been committed in other branch/merges)

* more cleanup

* fixes/changes to use DynamicBranch VaSet for switch-case analysis

* print statements with ()'s for pyV3

* initial break up of makeSwitch.  more work to be done.

* make SwitchCases vaset for every workspace by default.

* rearranging switchcase analysis code into different functions.  UGLY but
broken up and working.  prettification to follow.

* minor bugfix: satvals not initialized if determineCaseIndex fails (not
that it matters, since the result is the same)

* bugfixes, notes, improvements

* more correct-ish analysis for ptr-sized index-deltas

* further refinements: comments

* transient - don't run from here.  working on details.

* it works again, without that nasty xref/_event_list-chop hack.

* cleanup and additions to SYMT_*

* abstracted out which registers are used to hold a image base value into
the architecture class.

* abstracted out which operands are valid for the "jmp" instruction

* moved various architecture-specific things into the Architecture module.

* update getSymbolikPathsTo to include a graph arg

* indentation bugfix

* switchcase - skip "calls"

* clean up artifacts of previous dynamic branch tracking.
BUGFIX: viv unit test

* updates for graphcore changes

* more cleanup including logging

* lots of little things.

* mark array entries as numbers (offsets) or pointers.

* test using codeblocks instead of individual instructions to determine starting point.

* lots of beautification

* reintroduce comparison of opcodes versus codeblocks.
if upper bound not set, use MAX_CASES and let the pointer-checking and xref-checking limit the upper bound.  need more testing and better options.

* more work...

* Merge branch 'master' into atlas_switchcase_analysis

# Conflicts:
#	vivisect/tools/graphutil.py

* lots of changes. working on symboliks switchcase analysis v2.0

* lots of changes to switchcase v2.  original version is broken.

* bugfix in v1 code (introduced while trying to modularize)

* updates to switchcase v2.0

* working.  improved boundary ident.  NOT USABLE!

* continues to not work...

* v2: not quite right, but making huge progress.  naming is off, and haven't verified all the wiring yet.  but we're getting through the entire process (for better or worse)

* SAVEGAME - don't use... doesn't work!  but we're making headway with v2.0

* improvements, still not good.  about to add "getNormalizedConstraints()" so we can reduce the madness.

* still not working, but getting closer.  too many moving parts, trying to capture the progress

* tweak.

* lots of headway with libc-2.13.so (32bit PIC).  hopefully didn't break everything else.

* using timed path generator (20secs), add comment at jmp, lots of debugging changes.

* breakthrough?  seem to have trued up the baseoff/lower/upper stuff.  at least 32bit libc-2.13.so seems to like it.

* working better.  filtering out non-switch cases

* bugfix: don't overwrite the thunk_bx table each time you load the workspace!

* lots of switchcase analysis cleanup.

* bugfix.  merge fail.

* fix for @rakuyo's Symboliks cleanups

* bugfix: "upper" index really needs to be included in the switchcases.

* cleanup, CASE_FAILURE, tracking completed dynbranches (reducing analysis time by reducing duplication)

* "done" list, skipping analyzing dups, improved logging

* bugfix.  skipped too early.

* update cli addition of array-based switchcases (without all the smarts and analysis)

* woops.  need this code too (link_up in particular)

* make room for symswitchcase and switchcase (the original MS VS handler) to live side-by-side.

* make non-Windows targets use Symbolik Switchcase Analysis

* disable DEBUG logging for symswitchcase.py

* minor bug fixes in ARM disassembly and emulation.  helps get rid of some of the unittest error messages (yes, that's why they're there ;) (#305)

* revsync support (#304)

* make each parser add a sha256 hash of the file loaded

* refining approach to get bytes if they're possible.

* bugfix: addFileMeta requires a filename!
also changed Vivisect Extensions such that not only will .py files in the directory path be checked for vivExtension() functions, but so will directory/__init__.py files in the extensions directory.  this is intended to allow plugin/extensions to be self-contained within a directory and be copied or symlinked into a path that's in $VIV_EXT_PATH.

* comment per snickety @rakuyo ;)

* make blob md5 calc off preloaded bytes instead of file

* py3 it! (file->open)

* bugfix since msgpack added strict_map_key and we break that (#307)

thanks @rakuy0 for verification and pushing back for best quality.

* control creation of .viv directory (#310)

* control creation of .viv directory

* docstrings and rename param

* remove unnecessary param

* Even More Syntax Cleanup (#293)

A lot of cleanup things in prep for a python 3 transition. Getting rid of the old exception syntax, converting prints over to logging, cutting random scraps of code to be proper unit tests,  cut away some older bits of code, etc.

This still works in python2. It's just a lot of tidying up. There are no major functionality changes.

* setSymKid Speedup (#309)

A 40-60 percent reduction in runtime for symbolik reduction. Makes it so if the parent cache of a symbolik object is empty when we call setSymKid, we don't traverse all the way up the tree, due to some assumptions about how the caches get populated. See the setSymKid docstring in vivisect/symboliks/common.py for more information.

* A wild changelog appeared! (#312)

Add initial changelog.

* a few mods to enhance the CLI helpers for wiring up switchcases.

* a couple bugfixes and tweaks

* normalizing analysis modules

* bugfix: old switchcase analysis was *not* a fmod.  it hooked directly into the DynamicBranchHandlers.

* minor bugfix for handling deleted codeblocks (#317)

* getOperAddr normalization (#316)

* normalizing the prototype for getOperAddr() in i386, and returning None for non-deref operand (default).

* might as well update the "abstract" base class

* more in line with other getOperAddr()

* minor change to allow access to the tuple and lists of ARM registers … (#315)

* minor change to allow access to the tuple and lists of ARM registers (used in external tools).  this brings the ARM regs.py more inline with the other architectures' regs.py

* made the changes to arm_regs and arm_regs_tup, but didn't update the references.  this actually makes accessing registers more efficient :)

* import emulator to handle dynamic branches (switchcases) using only xrefs (#314)

* modify import emulator to handle dynamic branches (switchcases) using only xrefs.

* bugfix: forgot that getBranches returns REF_CODE/BR_DEREF options which are *not* direct code branches (eg. PLT).
added __ctype_b_loc to impapi

* Fix: syntax error discovered by pytocs (#318)

* Fix: Non-terminated string constant

`visgraph/renderers/svgrend.py` is missing a terminating apostrophe on line 41.

* Incorrect extra closing parenthesis removed

* Possible missing comment character '#'

Co-authored-by: atlas0fd00m <atlas@r4780y.com>

* IMAGE_FILE defs and honor NXCOMPAT (#319)

* Add File Header Defs and honor NX compat in DLLs

* a little stricter (since exe imgs can have it set too)

* add small unit test on the memory maps

* cleanup per @rakuyo

* first run: symbolik switchcase unittest.

* making the symswitchcase build test cases

* Msgpack storage module (that works in py2 and py3) (#321)

* add msgpack storage module

* encode mmaps

* cross version

* derp

* add unit tests for mpfile

* Bug hunting (#320)

* makePointer returns a tuple

* we bail on the first failure in carving

* unittests (and make carve mark the right dead data)

* rejigger some of the tests and fix a minor bug

* cleanup symswitchcases.py
some unittest work (not sure it's in working state yet)

* reorder/rearranging code/comments

* cleanup some more

* fixups per @rakuyo
bugfix:  thunk_bx only exists on i386, so it shouldn't be checked if it doesn't exist!
additional cleanup

* bad intel. no soup for you. (#326)

* symswitchcase and unittest updates.  not done yet.  savegame while merging in the latest master.

* one more mod, per @rakuyo (testing things out)

* Substring locations (#327)

* first stab at substring

* substring tests

* words and tests

* do the thing for unicode too

* trailing whitespace

* well found the failure

Co-authored-by: atlas0fd00m <atlas@r4780y.com>

* cobra: don't configure logging for everyone upon import (#330)

* November bughunt (#329)

* We decode ud* correctly, but never really added an INS_* definition for it. So added those.
* int1/icebp support since we apparently never had that
* Making sure we don't codeflow past those, and bringing in a fix from my py3 branch for not code flowing past hlt instructions.
* Even if we fail on codeblock addition, we can at least add the metadata to the function dictionary.
* We parse but do not make accessible the fixed file info from a PE file (should it be found)
* Enable ARM analysis on PE files
* Pathcount in the UI had succumbed to bitrot and needed fixing.
* Some symbolik reduction code coverage
* Address #322

Co-authored-by: atlas0fd00m <atlas@r4780y.com>

* Speed up for setSymKid, and a few decoding fixes (#332)

* vivisect: don't configure logging within library code (#334)

this stomps on the configuration provided by applications

* cleanup

* make walker test happy (falsely!  it's all a lie!)

* added amd64/ls switchcase test
minor mods to symswitchcase
cli switch bugfix

* oops.  turn off DEBUG log-level setting

* quiet!

* set recursion limit to 5000, code and unittest cleanup

* special exception handling for being unable to determine the Complex SymIdx.
improved filtering for unittest (testelf) names, and updates for the data files, to reduce "ptr_*" names.

* mods to testelf.py

* merge fail

* bugfix in ihex vstruct defs, and improved testelf

* cleanup unittests

* symswitchcase polishing... (partway done)

* reordering a few of the functions, and adding "mid-level functions" so we now have "low," "mid," and "high" level functions.  attempting to sort out the complexity.

* cleanup symswitchcase in-module notes.

* symswitchcase.py cleanup and documenting.  almost done.

* vivisect.helpers.getTestWorkspace() now loads binaries as well as .viv files

* configurable switchcase analysis parameters

* fixed switchcase unittests.

* touch up unittests

* py3

* logging changes, and bugfix for tgtva being None...

* make switchcase work on py3 (next/__next__)

* logging changes and unit test updates (inc py3 bugfixes)

* switchcase tests!

* demoting pointer log messages from info to debug

* more unittest tweaks and improvements for symswitchcase

* Bugfix: remove duplicate loading of each module (#374)

* Default VIV_EXT_PATH of ~/.viv/plugins
* Fix module name (no longer "viv_ext") so "from ." works correctly
* Add vw to namespace (some indication of running as an extension)

* oops, removed this as superfluous too quickly.

* dynamic dialog box helper (#376)

* dynamic dialog box helper

* actually do something with defaults, and update docstr to help.

* warning and informational messages too.

* bugfixes in QtCore and other imports

* cleanup and additions to the README

* oops, removed this prematurely.

* cleanup example gui extension

* Update README.md

thanks @williballenthin!

Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>

* supporting more than just EBX in 32-bit Intel PIC support.  this seeks out all the major registers to be used as thunks.
now we need to figure out why we *lost* switchcases from ld-2.31.so

* yay!  got 16 switches in LD, up from 12 in the old code and 10 in the most recent commit.  we had gained 4 and lost 6... now we have them all :)

* bugfix if iterJumpTable() doesn't actually iterate anything.
DynamicBranches update in PE unittest

LD unittests

* more unittest updates

* minor bugfix: check upper is not None before comparison with lower.

* tweaks

* drag out the Pathing Timeout to the SwitchCase.__init__() and increase to 60 seconds.

* moving down the road to merge-readiness..

* couple bugfixes

* helper to get early (light-weight) access to the symboliks of the last codeblock for switch-case-vetting (ie. determining that something isn't a switchcase before expending a ton of resources on it)

* refinements and small bugfixes

* wow, this has been long in coming.  i'm guessing Visi fixed the "add*Prop" but never fixed the "del*Prop"

* continued work

* minor updates, including a path-converter for using getHierPaths*() for Symboliks

* improve getFuncCbRoutedPaths:
* reduceGraph - ripping out nodes that aren't part of the desired path before pathing begin
* weighted node-checks - only check for loops when the target node weight <= current node weight

* quiet down the log messages a little

* set the timeout much higher in hopes that we can maintain a low test-time while catching 4-8 more switches.

* modified to make default timeout=45, but with the ability to rerun with higher timeouts.

* garbage collect after each `analyzeFunction()` (hope to pacify CircleCi's memory management)

* updated tests

* update tests

* gc

* undo the gc damage

* seemingly dramatic improvement on SymSwitch loop-checking

* symswitchcase config (renamed from switchcase to avoid conflict with non-symboliks switchcase info)
added `timeout_secs` and defaulted to 45 secs

* set switchcase timeout for unittests to 30secs

* tone down SymSwitch logging a little.

* clean up unittests

* bugfix in unittest

* clearRouting() in effort to limit RAM usage (to avoid the OOMs we've been getting in CircleCI)
unittest change due to limiting switchcase analysis

* try different gc parameters

* bugfix in testswitches
update data for stabilitydata

* catching testelf bins with limited timeout
updating timeouts to 10secs

* update test timeouts to 30secs (10secs completes the tests! yay!)

* tweak unittests

* add Timeout value to the SwitchCases_TimedOut VaSet (track how long we've spent trying to analyze each one)

add "Reanalyze" ContextMenu item for each va in the SwitchCases_TimedOut VaSet
add "newthread" capabilities to ACT function wrapper to fire a thread for menu actions

* main menu entry to reanalyze timed-out switchcases

* SymSwitch hackathon with @rakuy0

* reduce cli manual switchcase option which was basically superfluous.

* touch-ups per @rakuy0

* fix tests
fix comment ;)

* fix tests

* damn, i updated MockVar, not MockVw.
fixing.

* improved log messages
walker test finalized and wrapped in
better unittest-generator helpers

* mods per @rakuy0

* cleanup per @rakuy0

* cleanup per @rakuy0

* cleanup

* cleanup and relocation per @rakuy0

* last cleanup of register groups (for this PR)

* cleanup

Co-authored-by: atlas <atlas@grimm-co.com>
Co-authored-by: James Gross <45212823+rakuy0@users.noreply.github.com>
Co-authored-by: John Källén <uxmal@users.noreply.github.com>
Co-authored-by: Willi Ballenthin <william.ballenthin@fireeye.com>
Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>