Add instructionAPI/Expression.h
hainest committed Apr 3, 2024
1 parent 6384e37 commit 966e812
Showing 3 changed files with 107 additions and 196 deletions.
13 changes: 11 additions & 2 deletions docs/instructionAPI/public/Dereference.h.rst
@@ -5,7 +5,7 @@ Dereference.h

.. cpp:namespace:: Dyninst::InstructionAPI

.. cpp:class:: Dereference
.. cpp:class:: Dereference : public Expression

**Expression for an effective address**

@@ -64,6 +64,8 @@ Dereference.h

Checks if this expression is the same as ``rhs``.

.. _`sec:dereference-notes`:


Notes
=====
@@ -78,7 +80,7 @@ as follows:
2. Perform analysis to determine the contents of that address.

3. If necessary, fill in the ``Dereference`` node with the contents of
that address, using :cpp:func:`setValue`.
that address, using :cpp:func:`Expression::setValue`.

The type associated with a ``Dereference`` node will be the type of the
value *read from memory*, not the type used for the address
@@ -98,3 +100,10 @@ remain unchanged.
:align: center

Applying eval to a Dereference tree with two registers having user-provided values.

This concept is demonstrated in the operand represented as ``[ ebx + 4 * eax ]``.
The contents of ebx and eax have been determined through some outside mechanism, and have
been defined with :cpp:func:`Expression::setValue`. Evaluation proceeds to
determine the address being read since this information can be determined given the contents
of the registers. This address is available from the Dereference through its child in the tree,
even though calling ``eval`` on the Dereference returns a result with an undefined value.
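
The behavior described above can be sketched with a small self-contained model. Every name in it (``ToyResult``, ``ToyExpr``, ``node``, ``OperandEval``, ``evalScaledIndexOperand``) is a hypothetical stand-in written for illustration and is not part of InstructionAPI. The sketch assumes that a user-set value takes precedence during evaluation and that a dereference whose memory contents are unknown evaluates to undefined.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy stand-in for a Result: a value that may be undefined.
struct ToyResult {
    bool defined;
    std::int64_t value;
};

// Toy expression node. This is NOT Dyninst's Expression class.
struct ToyExpr {
    enum Kind { Register, Immediate, Add, Mul, Deref };
    typedef std::shared_ptr<ToyExpr> Ptr;

    Kind kind;
    ToyResult userValue;  // plays the role of a value supplied via setValue
    Ptr lhs, rhs;         // children; Deref uses only lhs (the address)

    ToyExpr(Kind k, Ptr l, Ptr r)
        : kind(k), userValue{false, 0}, lhs(std::move(l)), rhs(std::move(r)) {}

    void setValue(std::int64_t v) { userValue = ToyResult{true, v}; }

    ToyResult eval() const {
        if (userValue.defined) return userValue;  // a user-provided value wins
        if (kind == Add || kind == Mul) {
            ToyResult a = lhs->eval(), b = rhs->eval();
            // Undefined if either input is undefined.
            if (!a.defined || !b.defined) return ToyResult{false, 0};
            return ToyResult{true, kind == Add ? a.value + b.value
                                               : a.value * b.value};
        }
        return ToyResult{false, 0};  // unknown register or memory contents
    }
};

ToyExpr::Ptr node(ToyExpr::Kind k, ToyExpr::Ptr l = nullptr,
                  ToyExpr::Ptr r = nullptr) {
    return std::make_shared<ToyExpr>(k, std::move(l), std::move(r));
}

struct OperandEval {
    ToyResult address;  // eval() of the dereference's child
    ToyResult loaded;   // eval() of the dereference itself
};

// Builds [ ebx + 4 * eax ] with both registers set, then evaluates it.
OperandEval evalScaledIndexOperand(std::int64_t ebxVal, std::int64_t eaxVal) {
    ToyExpr::Ptr ebx = node(ToyExpr::Register);
    ToyExpr::Ptr eax = node(ToyExpr::Register);
    ToyExpr::Ptr four = node(ToyExpr::Immediate);
    ebx->setValue(ebxVal);
    eax->setValue(eaxVal);
    four->setValue(4);
    ToyExpr::Ptr address =
        node(ToyExpr::Add, ebx, node(ToyExpr::Mul, four, eax));
    ToyExpr::Ptr deref = node(ToyExpr::Deref, address);
    OperandEval out = { deref->lhs->eval(), deref->eval() };
    return out;
}
```

In this model, calling ``evalScaledIndexOperand(0x1000, 3)`` yields a defined address of ``0x100C`` from the child, while the dereference itself stays undefined because nothing has supplied the memory contents.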
194 changes: 94 additions & 100 deletions docs/instructionAPI/public/Expression.h.rst
@@ -1,132 +1,126 @@
.. _`sec:Expression.h`:

Expression.h
============
############

.. cpp:namespace:: Dyninst::InstructionAPI

Expression Class
----------------
.. cpp:class:: Expression : public InstructionAST

An ``Expression`` is an AST representation of how the value of an
operand is computed.
**AST representation of how the value of an operand is computed**

The ``Expression`` class extends the ``InstructionAST`` class by adding
the concept of evaluation to the nodes of an ``InstructionAST``.
Evaluation attempts to determine the ``Result`` of the computation that
the AST being evaluated represents. It will fill in results of as many
of the nodes in the tree as possible, and if full evaluation is
possible, it will return the result of the computation performed by the
tree.
.. cpp:type:: boost::shared_ptr<Expression> Ptr

Permissible leaf nodes of an ``Expression`` tree are RegisterAST and
Immediate objects. Permissible internal nodes are ``BinaryFunction`` and
Dereference objects. An ``Expression`` may represent an immediate value,
the contents of a register, or the contents of memory at a given
address, interpreted as a particular type.
A reference-counted pointer to an expression.

The ``Result``\ s in an ``Expression`` tree contain a type and a value.
Their values may be an undefined value or an instance of their
associated type. When two ``Result``\ s are combined using a
``BinaryFunction``, the ``BinaryFunction`` specifies the output type.
Sign extension, type promotion, truncation, and all other necessary
conversions are handled automatically based on the input types and the
output type. If both of the ``Result``\ s that are combined have defined
values, the combination will also have a defined value; otherwise, the
combination’s value will be undefined. For more information, see
Section `3.7 <#sec:result>`__, Section `3.10 <#sec:binaryFunction>`__,
and Section `3.11 <#sec:dereference>`__.

A user may specify the result of evaluating a given ``Expression``. This
mechanism is designed to allow the user to provide a Dereference or
RegisterAST with information about the state of memory or registers. It
may additionally be used to change the value of an Immediate or to
specify the result of a ``BinaryFunction``. This mechanism may be used
to support other advanced analyses.
.. cpp:member:: protected Result userSetValue

In order to make it more convenient to specify the results of particular
subexpressions, the ``bind`` method is provided. ``bind`` allows the
user to specify that a given subexpression has a particular value
everywhere that it appears in an expression. For example, if the state
of certain registers is known at the time an instruction is executed, a
user can ``bind`` those registers to their known values throughout an
``Expression``.
.. cpp:function:: const Result& eval() const

The evaluation mechanism, as mentioned above, will evaluate as many
sub-expressions of an expression as possible. Any operand that is more
complicated than a single immediate value, however, will depend on
register or memory values. The ``Result``\ s of evaluating each
subexpression are cached automatically using the ``setValue`` mechanism.
The ``Expression`` then attempts to determine its ``Result`` based on
the ``Result``\ s of its children. If this ``Result`` can be determined
(most likely because register contents have been filled in via
``setValue`` or ``bind``), it will be returned from ``eval``; if it can
not be determined, a ``Result`` with an undefined value will be
returned. See Figure 6 for an illustration of this concept; the operand
represented is ``[ EBX + 4 \ast EAX ]``. The contents of ``EBX`` and
``EAX`` have been determined through some outside mechanism, and have
been defined with ``setValue``. The ``eval`` mechanism proceeds to
determine the address being read by the ``Dereference``, since this
information can be determined given the contents of the registers. This
address is available from the Dereference through its child in the tree,
even though calling ``eval`` on the Dereference returns a ``Result``
with an undefined value.
Evaluates the expression and returns a :cpp:class:`Result` containing its value.

.. code-block:: cpp
Returns an undefined ``Result`` on failure. See :ref:`sec:expression-evaluation` for details.

typedef boost::shared_ptr<Expression> Ptr
.. cpp:function:: void setValue(const Result & knownValue)

A type definition for a reference-counted pointer to an ``Expression``.
Sets the evaluation result for this expression to ``knownValue``.

.. code-block:: cpp
.. cpp:function:: void clearValue()

const Result & eval() const
Sets the contents of this expression to undefined.

If the ``Expression`` can be evaluated, returns a ``Result`` containing
its value. Otherwise returns an undefined ``Result``.
The next time :cpp:func:`eval` is called, it will recalculate the value.

.. code-block:: cpp
.. cpp:function:: int size() const

const setValue(const Result & knownValue)
Returns the size of this expression’s result in **bytes**.

Sets the result of ``eval`` for this ``Expression`` to ``knownValue``.
.. cpp:function:: bool bind(Expression * expr, const Result & value)

.. code-block:: cpp
Searches for all instances of ``expr`` and sets the result for each subexpression to ``value``.

void clearValue()
Returns ``true`` if at least one instance of ``expr`` was found. See :ref:`sec:expression-binding`
for details.

``clearValue`` sets the contents of this ``Expression`` to undefined.
The next time ``eval`` is called, it will recalculate the value of the
``Expression``.
.. cpp:function:: virtual void apply(Visitor *v)

.. code-block:: cpp
Applies ``v`` in a postfix-order traversal of contained expressions (as :cpp:class:`AST`\ s)
with user-defined actions performed at each node of the tree.

int size() const
.. cpp:function:: virtual void getChildren(std::vector<Expression::Ptr> & children) const

``size`` returns the size of this ``Expression``\ ’s ``Result``, in
bytes.
Appends the children of this expression to ``children``.

.. code-block:: cpp
bool bind(Expression * expr, const Result & value)
.. cpp:function:: protected virtual bool isFlag() const

``bind`` searches for all instances of the Expression ``expr`` within
this Expression, and sets the result of ``eval`` for those
subexpressions to ``value``. ``bind`` returns ``true`` if at least one
instance of ``expr`` was found in this Expression.

``bind`` does not operate on subexpressions that happen to evaluate to
the same value. For example, if a dereference of ``0xDEADBEEF`` is bound
to 0, and a register is bound to ``0xDEADBEEF``, a dereference of that
register is not bound to 0.
.. cpp:class:: DummyExpr : public Expression

virtual void apply(Visitor \*)
.. cpp:function:: protected virtual bool checkRegID(MachRegister, unsigned int = 0, unsigned int = 0) const
.. cpp:function:: protected virtual bool isStrictEqual(const InstructionAST& rhs) const

``apply`` applies a ``Visitor`` to this ``Expression``. Visitors perform
postfix-order traversal of the ASTs represented by an ``Expression``,
with user-defined actions performed at each node of the tree. We present
a thorough discussion with examples in Section `3.6 <#sec:visitor>`__.
Notes
=====
This class extends ``InstructionAST`` by adding the concept of evaluation to the nodes.
Evaluation attempts to determine the :cpp:class:`Result` of the computation that
the :cpp:class:`AST` being evaluated represents. It will fill in results of as many
of the nodes in the tree as possible, and if full evaluation is
possible, it will return the result of the computation performed by the
tree.

virtual void getChildren(std::vector<Expression::Ptr> & children) const
Permissible leaf nodes of an expression tree are :cpp:class:`RegisterAST`,
:cpp:class:`Immediate`, and :cpp:class:`TernaryAST` objects. Permissible internal nodes are :cpp:class:`BinaryFunction` and
:cpp:class:`Dereference` objects. An expression may represent an immediate value,
the contents of a register, or the contents of memory at a given
address, interpreted as a particular type.

The :cpp:class:`Result`\ s in an expression tree contain a type and a value.
Their values may be an undefined value or an instance of their
associated type. When two results are combined using a
``BinaryFunction``, it specifies the output type.
Sign extension, type promotion, truncation, and all other necessary
conversions are handled automatically based on the input types and the
output type. If both of the results that are combined have defined
values, the combination will also have a defined value. Otherwise, the
combination’s value will be undefined.

A user may specify the result of evaluating a given expression. This
mechanism is designed to allow the user to provide a ``Dereference`` or
``RegisterAST`` with information about the state of memory or registers. It
may additionally be used to change the value of an Immediate or to
specify the result of a ``BinaryFunction``. This mechanism may be used
to support other advanced analyses.

.. _`sec:expression-binding`:

Binding
^^^^^^^

In order to make it more convenient to specify the results of particular
subexpressions, the :cpp:func:`bind` method is provided. ``bind`` allows the
user to specify that a given subexpression has a particular value
everywhere that it appears in an expression. For example, if the state
of certain registers is known at the time an instruction is executed, a
user can ``bind`` those registers to their known values throughout an
expression.
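
A minimal sketch of this idea follows. It is a toy model, not InstructionAPI code: ``ToyNode``, ``reg``, and ``op`` are hypothetical names, and the sketch assumes that matching is done by structural equality and that the return value reports whether any instance was found.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Toy expression node; a hypothetical stand-in, not Dyninst's Expression.
struct ToyNode {
    typedef std::shared_ptr<ToyNode> Ptr;

    std::string name;  // register name, or operator symbol for internal nodes
    Ptr lhs, rhs;      // both null for a register leaf
    bool defined;      // whether a value has been bound to this node
    std::int64_t value;

    ToyNode(std::string n, Ptr l, Ptr r)
        : name(std::move(n)), lhs(std::move(l)), rhs(std::move(r)),
          defined(false), value(0) {}

    // Structural equality: same name and structurally equal children.
    bool sameAs(const ToyNode& o) const {
        if (name != o.name) return false;
        if (static_cast<bool>(lhs) != static_cast<bool>(o.lhs)) return false;
        if (static_cast<bool>(rhs) != static_cast<bool>(o.rhs)) return false;
        return (!lhs || lhs->sameAs(*o.lhs)) && (!rhs || rhs->sameAs(*o.rhs));
    }

    // Toy bind: set the value on every subexpression equal to `target`.
    // Returns true if at least one instance was found.
    bool bind(const ToyNode& target, std::int64_t v) {
        if (sameAs(target)) {
            defined = true;
            value = v;
            return true;
        }
        bool found = false;
        // Visit both children so that every instance is bound.
        if (lhs) found = lhs->bind(target, v) || found;
        if (rhs) found = rhs->bind(target, v) || found;
        return found;
    }
};

ToyNode::Ptr reg(const std::string& n) {
    return std::make_shared<ToyNode>(n, nullptr, nullptr);
}

ToyNode::Ptr op(const std::string& sym, ToyNode::Ptr l, ToyNode::Ptr r) {
    return std::make_shared<ToyNode>(sym, std::move(l), std::move(r));
}
```

For the toy expression ``eax + (ebx * eax)``, binding ``eax`` to 7 sets both ``eax`` leaves and leaves ``ebx`` untouched; binding a register that does not appear, such as ``ecx``, returns ``false``.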

.. _`sec:expression-evaluation`:

Evaluation
^^^^^^^^^^

The evaluation mechanism, as mentioned above, will evaluate as many
sub-expressions of an expression as possible. Any operand that is more
complicated than a single immediate value, however, will depend on
register or memory values. The :cpp:class:`Result`\ s of evaluating each
subexpression are cached automatically using :cpp:func:`setValue`.
The expression then attempts to determine its result based on
the results of its children. If this result can be determined
(most likely because register contents have been filled in via
``setValue`` or ``bind``), it will be returned from ``eval``. If it
cannot be determined, a result with an undefined value will be
returned.

``bind`` does not operate on subexpressions that happen to evaluate to the same value. For example,
if a dereference of ``0xDEADBEEF`` is bound to 0, and a register is bound to ``0xDEADBEEF``,
a dereference of that register is not bound to 0.
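
The ``0xDEADBEEF`` case can be pictured as a lookup keyed by an expression's structure instead of by the value it evaluates to. The sketch below is only an analogy under that assumption: ``ToyBindings``, ``lookupBinding``, and ``exampleBindings`` are hypothetical names, and the textual keys stand in for expression trees.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy binding table keyed by an expression's textual form (its structure),
// not by the value the expression evaluates to. Hypothetical, not Dyninst code.
typedef std::map<std::string, std::uint64_t> ToyBindings;

// Returns true and stores the bound value only if `expr` itself was bound.
bool lookupBinding(const ToyBindings& b, const std::string& expr,
                   std::uint64_t& out) {
    ToyBindings::const_iterator it = b.find(expr);
    if (it == b.end()) return false;
    out = it->second;
    return true;
}

ToyBindings exampleBindings() {
    ToyBindings b;
    b["[0xDEADBEEF]"] = 0;   // dereference of the literal address, bound to 0
    b["eax"] = 0xDEADBEEF;   // register bound to that same address
    return b;
}
```

Looking up ``[eax]`` in these bindings finds nothing, even though ``eax`` holds ``0xDEADBEEF`` and ``[0xDEADBEEF]`` is bound, because the two dereferences differ structurally.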

``getChildren`` may be called on an ``Expression`` taking a vector of
``ExpressionPtr``\ s, rather than ``InstructionAST``\ Ptrs. All children
which are ``Expression``\ s will be appended to ``children``.
See the :ref:`Dereference Notes <sec:dereference-notes>` for a detailed example.
96 changes: 2 additions & 94 deletions instructionAPI/h/Expression.h
@@ -46,81 +46,10 @@ namespace Dyninst

class Expression;
class Visitor;
/// An %Expression is an AST representation of how the value of an
/// operand is computed.
///
/// The %Expression class extends the %InstructionAST class by
/// adding the concept of evaluation to the nodes of an
/// %InstructionAST. Evaluation attempts to determine the Result
/// of the computation that the AST being evaluated represents.
/// It will fill in results of as many of the nodes in the tree as
/// possible, and if full evaluation is possible, it will return
/// the result of the computation performed by the tree.
///
/// Permissible leaf nodes of a %Expression tree are %RegisterAST
/// and %Immediate objects. Permissible internal nodes are
/// %BinaryFunction and %Dereference objects. An %Expression may
/// represent an immediate value, the contents of a register, or
/// the contents of memory at a given address, interpreted as a
/// particular type.
///
/// The %Results in an %Expression tree contain a type and a
/// value. Their values may be an undefined value or an instance
/// of their associated type. When two %Results are combined
/// using a %BinaryFunction, the %BinaryFunction specifies the
/// output type. Sign extension, type promotion, truncation, and
/// all other necessary conversions are handled automatically
/// based on the input types and the output type. If both of the
/// %Results that are combined have defined values, the
/// combination will also have a defined value; otherwise, the
/// combination's value will be undefined. For more information,
/// see Result, BinaryFunction, and Dereference.
///
/// A user may specify the result of evaluating a given
/// %Expression. This mechanism is designed to allow the user to
/// provide a %Dereference or %RegisterAST with information about
/// the state of memory or registers. It may additionally be used
/// to change the value of an %Immediate or to specify the result
/// of a %BinaryFunction. This mechanism may be used to support
/// other advanced analyses.
///
/// In order to make it more convenient to specify the results
/// of particular subexpressions, the \c bind method is provided.
/// \c bind allows the user to specify that a given subexpression
/// has a particular value everywhere that it appears in an expression.
/// For example, if the state of certain registers is known at the
/// time an instruction is executed, a user can \c bind those registers
/// to their known values throughout an %Expression.
///
/// The evaluation mechanism, as mentioned above, will evaluate as
/// many sub-expressions of an expression as possible. Any
/// operand that is more complicated than a single immediate
/// value, however, will depend on register or memory values. The
/// %Results of evaluating each subexpression are cached
/// automatically using the \c setValue mechanism. The
/// %Expression then attempts to determine its %Result based on
/// the %Results of its children. If this %Result can be
/// determined (most likely because register contents have been
/// filled in via \c setValue or \c bind), it will be returned from \c eval;
/// if it can not be determined, a %Result with an undefined value
/// will be returned. See Figure 6 for an illustration of this
/// concept; the operand represented is [ \c EBX + \c 4 * \c EAX
/// ]. The contents of \c EBX and \c EAX have been determined
/// through some outside mechanism, and have been defined with \c
/// setValue. The \c eval mechanism proceeds to determine the
/// address being read by the %Dereference, since this information
/// can be determined given the contents of the registers. This
/// address is available from the %Dereference through its child
/// in the tree, even though calling \c eval on the %Dereference
/// returns a %Result with an undefined value. \dotfile
/// deref-eval.dot "Applying \c eval to a Dereference tree with
/// the state of the registers known and the state of memory
/// unknown"
///

class INSTRUCTION_EXPORT Expression : public InstructionAST
{
public:
/// \brief A type definition for a reference counted pointer to a %Expression.
typedef boost::shared_ptr<Expression> Ptr;
protected:
Expression(Result_Type t);
@@ -129,40 +58,19 @@ namespace Dyninst
virtual ~Expression();
Expression(const Expression&) = default;

/// \brief If the %Expression can be evaluated, returns a %Result containing its value.
/// Otherwise returns an undefined %Result.
virtual const Result& eval() const;

/// \param knownValue Sets the result of \c eval for this %Expression
/// to \c knownValue
void setValue(const Result& knownValue);

/// \c clearValue sets the contents of this %Expression to undefined.
/// The next time \c eval is called, it will recalculate the value of the %Expression.
void clearValue();

/// \c size returns the size of this %Expression's %Result, in bytes.
// Size of the result in bytes
int size() const;

/// \c bind searches for all instances of the %Expression \c expr within
/// this %Expression, and sets the result of \c eval for those subexpressions
/// to \c value. \c bind returns true if at least one instance of \c expr
/// was found in this %Expression.
///
/// \c bind does not operate on subexpressions that happen to evaluate to
/// the same value. For example, if a dereference of 0xDEADBEEF is bound to
/// 0, and a register is bound to 0xDEADBEEF, a dereference of that register is not
/// bound to 0.
virtual bool bind(Expression* expr, const Result& value);


/// \c apply applies a %Visitor to this expression. %Visitors perform postfix-order
/// traversal of the ASTs represented by an %Expression, with user-defined actions performed
/// at each node of the tree.
virtual void apply(Visitor*) {}

/// \c getChildren may be called on an %Expression taking a vector of %Expression::Ptrs,
/// rather than %InstructionAST::Ptrs. All children which are %Expressions will be appended to \c children.
virtual void getChildren(std::vector<Expression::Ptr>& children) const = 0;
using InstructionAST::getChildren;
