finish up reading about abicompat
the actual guts of the comparison are rather simple, and exactly what you would
expect: comparing the overlap of symbols the application needs with what the
library provides, and looking for differences that would render them not
compatible.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch committed Mar 8, 2021
1 parent 8573791 commit 6068f7c
We then read variable and function descriptions via the dwarf handle:
corpus_sptr corp = read_debug_info_into_corpus(ctxt);
```

This function is defined [here](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L15989).
We start with properties gathered from the ELF, which the author calls "mundane":

```cpp
// First set some mundane properties of the corpus gathered from
// ELF.
ctxt.current_corpus()->set_path(ctxt.elf_path());
if (is_linux_kernel(ctxt.elf_handle()))
  ctxt.current_corpus()->set_origin(corpus::LINUX_KERNEL_BINARY_ORIGIN);
else
  ctxt.current_corpus()->set_origin(corpus::DWARF_ORIGIN);
ctxt.current_corpus()->set_soname(ctxt.dt_soname());
ctxt.current_corpus()->set_needed(ctxt.dt_needed());
ctxt.current_corpus()->set_architecture_name(ctxt.elf_architecture());
if (corpus_group_sptr group = ctxt.current_corpus_group())
  group->add_corpus(ctxt.current_corpus());
```
and then, for our purposes (since this isn't being loaded in kernel mode), we set the
function and variable symbol maps [here](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L16042):
```cpp
ctxt.current_corpus()->set_fun_symbol_map(ctxt.fun_syms_sptr());
ctxt.current_corpus()->set_var_symbol_map(ctxt.var_syms_sptr());
```

Both of these, I think, literally take a map of symbols and add it to the
corpus (example [here](https://github.com/woodard/libabigail/blob/master/src/abg-corpus.cc#L853)).
A lot of this code (so far) has been getting and setting things that we've read from
the ELF.

We then again exit early (and return the corpus) if [no debug info is found](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L16058).

#### Read declarations

There are a few user-input variables (e.g., the suppression file) through which the
user can specify what they want read (or not). At this point we compare the
declarations that are defined against these user preferences to get the final set
we want to "export":

```cpp
// Set the set of exported declaration that are defined.
ctxt.exported_decls_builder
  (ctxt.current_corpus()->get_exported_decls_builder().get());
```

This is also the first mention of a DIE, a "Debugging Information Entry,"
which generally is a [descriptive entity in DWARF](https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.cbcdd01/dwarfelfterminology.htm) that can describe functions, variables, and types. Importantly,
note that each DIE (aside from the tag that identifies it, a section offset, and a list of attributes) also has:

> Nested-level indicators, which identify the parent/child relationship of the DIEs in the DIE section.

So it sounds like a DIE can be nested, and we would need to unpack that. In the example
below, the part that starts with `<1>` is a child DIE of `<0>`, with tag `DW_TAG_DIE02`:

```
.debug_section_name
<unit header offset =0>unit_hdr_off:
<0>< 11>      DW_TAG_DIE01
              DW_AT_01 value00
<1>< 20>      DW_TAG_DIE02
              DW_AT_01 value01
              DW_AT_02 value02
```
The name of each dwarf debug section starts with `.debug`, as we see above.
We then (based on the nesting in the example above) [build a DIE -> parent map](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L16068).


### 5. Building the libabigail IR

The "IR" is the "internal representation" of the ABI, so this should be the point
where libabigail does something special (so far we've just read and get/set data).
This is where we [build the libabigail IR](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L16099).

We walk through each DIE (debugging information entry), use the context to read to that spot,
and then it looks like we build something called a "translation unit IR":

```cpp
// Build a translation_unit IR node from cu; note that cu must
// be a DW_TAG_compile_unit die.
translation_unit_sptr ir_node =
  build_translation_unit_and_add_to_ir(ctxt, &unit, address_size);
ABG_ASSERT(ir_node);
```
The function `build_translation_unit_and_add_to_ir` generates what libabigail calls
an `abigail::translation_unit` IR node, which starts with one of these `DW_TAG_compile_unit`
DIEs and recursively reads children, adding them to the node. It looks like the function
[reads attributes from the die](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L12873).
It looks like there are a few cases for what we read when building a translation unit from a DIE:

1. It could have the name `<artificial>`, meaning it was artificially generated by the compiler. Libabigail saves this and adds a suffix for the location (probably in case there is another one with the same name?).
2. It could already exist in the current corpus, because the same translation unit can be repeated (with different information), in which case a union is taken.

We then add the [result](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L12911) to the
current translation unit and to the "die_tu_map" (a DIE -> translation unit map?),
and call `build_ir_node_from_die` [here](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L17066):
> Build an IR node from a given DIE and add the node to the current
> IR being build and held in the read_context. Doing that is called
> "emitting an IR node for the DIE".

We also loop through the children (while there are more) and generate [mangled strings](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L12936), etc. I wish I could actually trace variables to better understand what is going on.
We then do [canonicalization and sorting](https://github.com/woodard/libabigail/blob/master/src/abg-dwarf-reader.cc#L16228)
and return the corpus.

At a high level, my understanding is that we've read the ELF variable and function symbols
and the DWARF debugging information, and shoved it all into a corpus object for later use.
This is for the main application of interest, and it is probably the result we would get
(and print out to XML) with `abidw`.

### 6. Undefined symbols only?

At this point we jump through a bunch of returns to hand the application corpus
back to the calling function `get_corpus_from_elf` in `abicompat`. If the user has asked
for only undefined symbols, we filter to those [here](https://github.com/woodard/libabigail/blob/master/tools/abicompat.cc#L694).
(Do we need to parse and save everything if we only want undefined symbols?)

We then do exactly the same thing for the [first library of interest](https://github.com/woodard/libabigail/blob/master/tools/abicompat.cc#L726)
to generate a second corpus, and for the [second library of interest](https://github.com/woodard/libabigail/blob/master/tools/abicompat.cc#L753)
(if one is provided). The final check runs a function that compares either the application and a single library (weak mode)
or the application and two libraries (normal mode):
```cpp
if (opts.weak_mode)
  s = perform_compat_check_in_weak_mode(opts, ctxt,
                                        app_corpus,
                                        lib1_corpus);
else
  s = perform_compat_check_in_normal_mode(opts, ctxt,
                                          app_corpus,
                                          lib1_corpus,
                                          lib2_corpus);
```

### 7. Comparing ABI Corpora

At this point, we are in the [function shown above](https://github.com/woodard/libabigail/blob/master/tools/abicompat.cc#L431)
where we aim to:

> Perform a compatibility check of an application corpus and a
> library corpus.
> The types of the variables and functions exported by the library
> and consumed by the application are compared with the types
> expected by the application. This function checks that the types
> mean the same thing; otherwise it emits on standard output type
> layout differences found.

Intuitively, that's what I would have guessed!

#### Filter down to functions and variables of interest

This comes directly from the function header: basically, we are only interested
in the functions and variables that are exported by the library corpus
and whose symbols are undefined in the app corpus (i.e., the ones the app would need).

> Functions and variables defined and exported by lib_corpus which
> symbols are undefined in app_corpus are the artifacts we are
> interested in.

This means that we drop all functions and variables from the library corpus
whose symbols are *not* undefined in the app corpus:

> So let's drop all functions and variables from lib_corpus that
> are so that their symbols are *NOT* undefined in app_corpus.
> In other words, let's only keep the functions and variables from
> lib_corpus that are consumed by app_corpus.

This makes sense, but if we are storing an application generally (e.g., for
spack), we wouldn't know in advance the particular set needed for some library. We'd
have to store them all (and then possibly link a specific subset to an app).

#### Compare the filtered set

Now that we have a filtered set, we compare the functions exported by
the library corpus to what the app corpus expects:

> So we are now going to compare the functions that are exported by
> lib_corpus against those that app_corpus expects.
> In other words, the functions which symbols are defined by
> lib_corpus are going to be compared to the functions and
> variables which are undefined in app_corpus.

This is also what I'd expect. We loop through the functions in the library corpus and,
for the overlap, look at types and versions. If expected != provided, we store
the difference in a vector (which we eventually print for the user). We do this
for each of the functions and variables. The `compute_diff` function requires
that the two decls were produced in the same environment (see [here](https://github.com/woodard/libabigail/blob/master/src/abg-comparison.cc#L3122)). At the end, we print the findings to the user.
