Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import null pointer information from PDG into static analysis #1086

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions analysis/runtime/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,3 +19,4 @@ enum_dispatch = "0.3"
fs-err = "2"
crossbeam-queue = "0.3"
crossbeam-utils = "0.8"
indexmap = { version = "1.9", features = ["serde"] }
34 changes: 26 additions & 8 deletions analysis/runtime/src/events.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,10 +30,12 @@ pub enum EventKind {

CopyRef,

/// Field projection. Used for operations like `_2 = &(*_1).0`. Nested field
/// accesses like `_4 = &(*_1).x.y.z` are broken into multiple `Node`s, each
/// covering one level.
Field(Pointer, u32),
/// Projection. Used for operations like `_2 = &(*_1).0`.
/// The third value is a "projection index" that points to an element
/// of the projections data structure in the metadata. It is used to
/// disambiguate between different projections with the same pointer,
/// e.g., `(*p).x` and `(*p).x.a` where `a` is at offset 0.
Project(Pointer, Pointer, usize),

Alloc {
size: usize,
Expand Down Expand Up @@ -61,8 +63,19 @@ pub enum EventKind {
/// are necessary for its underlying address.
StoreAddrTaken(Pointer),

/// The pointer that appears as the address result of addr_of(Local)
AddrOfLocal(Pointer, Local),
/// The pointer that appears as the address result of addr_of(Local).
/// The fields encode the pointer of the local, its MIR index, and its size.
AddrOfLocal {
ptr: Pointer,
local: Local,
size: u32,
},

/// The address and size of a constant value referenced in the MIR.
AddrOfConst {
ptr: Pointer,
size: usize,
},

/// Casting the pointer to an int
ToInt(Pointer),
Expand Down Expand Up @@ -90,7 +103,9 @@ impl Debug for EventKind {
use EventKind::*;
match *self {
CopyPtr(ptr) => write!(f, "copy(0x{:x})", ptr),
Field(ptr, id) => write!(f, "field(0x{:x}, {})", ptr, id),
Project(ptr, new_ptr, idx) => {
write!(f, "project(0x{:x}, 0x{:x}, [{}])", ptr, new_ptr, idx)
}
Alloc { size, ptr } => {
write!(f, "malloc({}) -> 0x{:x}", size, ptr)
}
Expand All @@ -107,7 +122,10 @@ impl Debug for EventKind {
StoreAddr(ptr) => write!(f, "store(0x{:x})", ptr),
StoreAddrTaken(ptr) => write!(f, "store(0x{:x})", ptr),
CopyRef => write!(f, "copy_ref"),
AddrOfLocal(ptr, _) => write!(f, "addr_of_local = 0x{:x}", ptr),
AddrOfLocal { ptr, local, size } => {
write!(f, "addr_of_local({:?}, {}) = 0x{:x}", local, size, ptr)
}
AddrOfConst { ptr, size } => write!(f, "addr_of_const({}) = 0x{:x}", size, ptr),
ToInt(ptr) => write!(f, "to_int(0x{:x})", ptr),
FromInt(ptr) => write!(f, "from_int(0x{:x})", ptr),
LoadValue(ptr) => write!(f, "load_value(0x{:x})", ptr),
Expand Down
35 changes: 31 additions & 4 deletions analysis/runtime/src/handlers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,18 @@ pub fn reallocarray(mir_loc: MirLocId, old_ptr: usize, nmemb: u64, size: u64, ne
/// = note: rustdoc does not allow disambiguating between `*const` and `*mut`, and pointers are unstable until it does
/// ```
pub fn offset(mir_loc: MirLocId, ptr: usize, offset: isize, new_ptr: usize) {
// Corner case: Offset(..) events with a base pointer of zero are special
// because the result might be an actual pointer, e.g., c2rust will
// emit a pointer increment `a += b` as `a = a.offset(b)` which we need
// to ignore here if `a == 0` which is equivalent to `a = b`.
if ptr == 0 {
RUNTIME.send_event(Event {
mir_loc,
kind: EventKind::CopyPtr(offset as usize),
});
return;
}

RUNTIME.send_event(Event {
mir_loc,
kind: EventKind::Offset(ptr, offset, new_ptr),
Expand Down Expand Up @@ -110,10 +122,10 @@ pub const HOOK_FUNCTIONS: &[&str] = &[
hook_fn!(offset),
];

pub fn ptr_field(mir_loc: MirLocId, ptr: usize, field_id: u32) {
pub fn ptr_project(mir_loc: MirLocId, ptr: usize, new_ptr: usize, proj_idx: usize) {
RUNTIME.send_event(Event {
mir_loc,
kind: EventKind::Field(ptr, field_id),
kind: EventKind::Project(ptr, new_ptr, proj_idx),
});
}

Expand All @@ -138,10 +150,25 @@ pub fn ptr_to_int(mir_loc: MirLocId, ptr: usize) {
});
}

pub fn addr_of_local(mir_loc: MirLocId, ptr: usize, local: u32) {
pub fn addr_of_local(mir_loc: MirLocId, ptr: usize, local: u32, size: u32) {
RUNTIME.send_event(Event {
mir_loc,
kind: EventKind::AddrOfLocal(ptr, local.into()),
kind: EventKind::AddrOfLocal {
ptr,
local: local.into(),
size,
},
});
}

pub fn addr_of_const<T: ?Sized>(mir_loc: MirLocId, ptr: *const T) {
let size = unsafe { core::mem::size_of_val(&*ptr) };
RUNTIME.send_event(Event {
mir_loc,
kind: EventKind::AddrOfConst {
ptr: ptr as *const u8 as usize,
size,
},
});
}

Expand Down
10 changes: 9 additions & 1 deletion analysis/runtime/src/metadata.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
use indexmap::IndexSet;
use std::{
collections::HashMap,
fmt::{self, Debug, Formatter},
Expand All @@ -13,6 +14,7 @@ use crate::mir_loc::{Func, FuncId, MirLoc, MirLocId};
pub struct Metadata {
pub locs: Vec<MirLoc>,
pub functions: HashMap<FuncId, String>,
pub projections: IndexSet<Vec<usize>>,
}

impl Metadata {
Expand Down Expand Up @@ -46,11 +48,17 @@ impl FromIterator<Metadata> for Metadata {
fn from_iter<I: IntoIterator<Item = Metadata>>(iter: I) -> Self {
let mut locs = Vec::new();
let mut functions = HashMap::new();
let mut projections = IndexSet::new();
for metadata in iter {
locs.extend(metadata.locs);
functions.extend(metadata.functions);
projections.extend(metadata.projections);
}
Self {
locs,
functions,
projections,
}
Self { locs, functions }
}
}

Expand Down
4 changes: 4 additions & 0 deletions analysis/tests/misc/src/pointers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,9 @@ pub unsafe extern "C" fn malloc_wrapper(mut size: size_t) -> *mut libc::c_void {
#[no_mangle]
pub unsafe extern "C" fn recur(x: libc::c_int, s: *mut S) {
if x == 0 {
if s.is_null() {
return;
}
return free(s as *mut libc::c_void);
}

Expand Down Expand Up @@ -122,6 +125,7 @@ pub unsafe extern "C" fn simple() {
let s = *y;
*x = s;
recur(3, x);
recur(3, std::ptr::null_mut());
free(x2 as *mut libc::c_void);
}

Expand Down
86 changes: 60 additions & 26 deletions c2rust-analyze/src/analyze.rs
Original file line number Diff line number Diff line change
Expand Up @@ -943,16 +943,37 @@ fn run(tcx: TyCtxt) {
let f = std::fs::File::open(pdg_file_path).unwrap();
let graphs: Graphs = bincode::deserialize_from(f).unwrap();

let mut known_nulls = HashSet::new();
for g in &graphs.graphs {
for n in &g.nodes {
let dest_pl = match n.dest.as_ref() {
Some(x) => x,
None => {
continue;
}
};
if !dest_pl.projection.is_empty() {
continue;
}
let dest = dest_pl.local;
let dest = Local::from_u32(dest.index);
if g.is_null {
known_nulls.insert((n.function.id, dest));
}
}
}

for g in &graphs.graphs {
for n in &g.nodes {
let def_path_hash: (u64, u64) = n.function.id.0.into();
let ldid = match func_def_path_hash_to_ldid.get(&def_path_hash) {
Some(&x) => x,
None => {
panic!(
eprintln!(
"pdg: unknown DefPathHash {:?} for function {:?}",
n.function.id, n.function.name
);
continue;
}
};
let info = func_info.get_mut(&ldid).unwrap();
Expand All @@ -961,6 +982,8 @@ fn run(tcx: TyCtxt) {
let mir = mir.borrow();
let acx = gacx.function_context_with_data(&mir, info.acx_data.take());
let mut asn = gasn.and(&mut info.lasn);
let mut updates_forbidden =
g_updates_forbidden.and_mut(&mut info.l_updates_forbidden);

let dest_pl = match n.dest.as_ref() {
Some(x) => x,
Expand All @@ -976,6 +999,14 @@ fn run(tcx: TyCtxt) {
let dest = dest_pl.local;
let dest = Local::from_u32(dest.index);

if acx.local_tys.get(dest).is_none() {
eprintln!(
"pdg: {}: local {:?} appears as dest, but is out of bounds",
n.function.name, dest
);
info.acx_data.set(acx.into_data());
continue;
}
let ptr = match acx.ptr_of(dest) {
Some(x) => x,
None => {
Expand All @@ -988,36 +1019,39 @@ fn run(tcx: TyCtxt) {
}
};

let node_info = match n.info.as_ref() {
Some(x) => x,
None => {
eprintln!(
"pdg: {}: node with dest {:?} is missing NodeInfo",
n.function.name, dest
);
info.acx_data.set(acx.into_data());
continue;
}
};
let allow_unsound = env::var("C2RUST_ANALYZE_PDG_ALLOW_UNSOUND")
.map(|val| val == "1")
.unwrap_or(false);

let old_perms = asn.perms()[ptr];
let mut perms = old_perms;
if node_info.flows_to.load.is_some() {
perms.insert(PermissionSet::READ);
}
if node_info.flows_to.store.is_some() {
perms.insert(PermissionSet::WRITE);
}
if node_info.flows_to.pos_offset.is_some() {
perms.insert(PermissionSet::OFFSET_ADD);
}
if node_info.flows_to.neg_offset.is_some() {
perms.insert(PermissionSet::OFFSET_SUB);
if known_nulls.contains(&(n.function.id, dest)) {
perms.remove(PermissionSet::NON_NULL);
} else if allow_unsound {
perms.insert(PermissionSet::NON_NULL);
// Unsound update: if we have never seen a NULL for
// this local in the PDG, prevent the static analysis
// from changing that permission.
updates_forbidden[ptr].insert(PermissionSet::NON_NULL);
}
if !node_info.unique {
perms.remove(PermissionSet::UNIQUE);

if let Some(node_info) = n.info.as_ref() {
if node_info.flows_to.load.is_some() {
perms.insert(PermissionSet::READ);
}
if node_info.flows_to.store.is_some() {
perms.insert(PermissionSet::WRITE);
}
if node_info.flows_to.pos_offset.is_some() {
perms.insert(PermissionSet::OFFSET_ADD);
}
if node_info.flows_to.neg_offset.is_some() {
perms.insert(PermissionSet::OFFSET_SUB);
}
if !node_info.unique {
perms.remove(PermissionSet::UNIQUE);
}
}
// TODO: PermissionSet::NON_NULL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spernsteiner If I'm understanding #1088 correctly, the other thing we need to do here is add NON_NULL to updates_forbidden?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roughly, yes. Though we might want to experiment a bit - for example, it might be easier to understand the results if we set updates_forbidden only on function signatures, or only on named variables and not on temporaries.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there can be two nodes with the same dest (either in different Graphs or in the same one), representing two different observations of that dest during the execution. We want to set NON_NULL in perms (and in updates_forbidden) only if all observations with the same dest show it containing a non-null pointer. So we need some additional logic beyond just doing perms.insert(NON_NULL); updates_forbidden.insert(NON_NULL); here.

if perms != old_perms {
let added = perms & !old_perms;
Expand Down
3 changes: 2 additions & 1 deletion dynamic_instrumentation/src/arg.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,14 +35,15 @@ impl<'tcx> ArgKind<'tcx> {
#[derive(Clone, Debug)]
pub enum InstrumentationArg<'tcx> {
AddrOf(Operand<'tcx>),
Pointer(Operand<'tcx>),
Op(ArgKind<'tcx>),
}

impl<'tcx> InstrumentationArg<'tcx> {
pub fn inner(&self) -> &Operand<'tcx> {
use InstrumentationArg::*;
match self {
AddrOf(x) => x,
AddrOf(x) | Pointer(x) => x,
Op(x) => x.inner(),
}
}
Expand Down
Loading
Loading