IH-657: Reduce XAPI code duplication #5856

Merged


last-genius
Contributor

Best reviewed by commit


Several fairly large modules are duplicated in XAPI. Historically this happened because the repositories were separate: adding inter-dependencies carried more friction, and opening PRs in several repositories was harder, so it was sometimes easier to copy and paste code.

This is problematic because the copies have already been (or could in the future be) modified separately, so a fix or improvement made to one copy never reaches the identical interface in a different part of the repository.

This PR removes duplicated modules in:

  • ocaml/vhd-tool/src/xenstore.ml and ocaml/libs/ezxenstore/core/xenstore.ml

    • vhd-tool now uses ezxenstore's implementation. It is somewhat newer, and the only differences are additional helper functions and logging facilities; functionally it is the same (diff ocaml/vhd-tool/src/xenstore.ml ocaml/libs/ezxenstore/core/xenstore.ml)

  • ocaml/vhd-tool/src/cohttp_unbuffered_io.ml and ocaml/xen-api-client/lwt/cohttp_unbuffered_io.ml

    • vhd-tool uses xen-api-client's implementation
      The only difference was in the type of the channel implementations used. Therefore, the module was converted into a functor.
    $ diff ocaml/vhd-tool/src/cohttp_unbuffered_io.ml ocaml/xen-api-client/lwt/cohttp_unbuffered_io.ml
    36c36
    <   ; c: Channels.t
    ---
    >   ; c: Data_channel.t
    44c44
    < type oc = Channels.t
    ---
    > type oc = Data_channel.t
    46c46
    < type conn = Channels.t
    ---
    > type conn = Data_channel.t
    50c50
    <   c.Channels.really_read tmp >>= fun () ->
    ---
    >   c.Data_channel.really_read tmp >>= fun () ->
    64c64,65
    <     let input x c = if c = x.marker.[x.i] then x.i <- x.i + 1 else x.i <- 0
    ---
    >     let input x c =
    >       if c = String.get x.marker x.i then x.i <- x.i + 1 else x.i <- 0
    68a70
    >     (* let to_string x = Printf.sprintf "%d" x.i *)
    127c129
    <   oc.Channels.really_write buf
    ---
    >   oc.Data_channel.really_write buf

  • module CBuf in ocaml/libs/xapi-stdext/lib/xapi-stdext-unix/unixext.ml and ocaml/xapi-idl/lib/posix_channel.ml

    • xapi-idl now uses stdext-unix's module. The two were identical apart from the name of an unused exception binding:
    $ diff_sections ocaml/libs/xapi-stdext/lib/xapi-stdext-unix/unixext.ml 303-358 ocaml/xapi-idl/lib/posix_channel.ml 7-62
         let len = next - x.start in
         let written =
           try Unix.single_write fd x.buffer x.start len
    -      with _ ->
    +      with _e ->
             x.w_closed <- true ;
             len
         in
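
As a rough illustration of the functor conversion described for cohttp_unbuffered_io above, here is a minimal sketch. It is not the code from this PR: the CHANNEL signature, the Make functor, and the in-memory Mem_channel are all hypothetical names, and the real module works with Lwt and the Channels.t / Data_channel.t types, which are omitted here to keep the example self-contained.

```ocaml
(* Hypothetical signature: the only thing the shared I/O code needs to know
   about a channel. The real code uses Lwt; this sketch is synchronous. *)
module type CHANNEL = sig
  type t

  val really_read : t -> bytes -> unit

  val really_write : t -> string -> unit
end

(* The formerly duplicated module body, written once and parameterised over
   the channel implementation. *)
module Make (Ch : CHANNEL) = struct
  type ic = Ch.t

  type oc = Ch.t

  type conn = Ch.t

  (* Read exactly [n] bytes from the channel *)
  let read_exactly (c : ic) n =
    let buf = Bytes.create n in
    Ch.really_read c buf ; Bytes.to_string buf

  let write (oc : oc) s = Ch.really_write oc s
end

(* Hypothetical in-memory channel, used only to show an instantiation *)
module Mem_channel = struct
  type t = {mutable pending: string; out: Buffer.t}

  let really_read t buf =
    let n = Bytes.length buf in
    if String.length t.pending < n then raise End_of_file ;
    Bytes.blit_string t.pending 0 buf 0 n ;
    t.pending <- String.sub t.pending n (String.length t.pending - n)

  let really_write t s = Buffer.add_string t.out s
end

module Mem_io = Make (Mem_channel)
```

Each consumer then applies the functor to its own channel module, so the body exists in one place and a fix to it reaches every user.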

====

Several other duplicated modules were found during the development of this PR:

  • module Observer in ocaml/xapi-idl/xen/xenops_interface.ml and ocaml/xapi-idl/cluster/cluster_interface.ml

    This change was not undertaken: although the text of the two modules is identical, each copy captures functions from the module that encloses it, and those differ enough to cause tests to fail.

  • module Delay in ocaml/libs/xapi-stdext/lib/xapi-stdext-threads/threadext.ml and ocaml/message-switch/unix/protocol_unix_scheduler.ml

    These are somewhat emblematic of the issue described above: one module was changed to epoll in master while the other still uses select, simply because they are duplicated. This will be fixed once the final epoll PRs are merged (see 14eca11 in "[epoll] Unix.select conversion: replace with stdext modules/calls or polly calls", #5705)
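
To make the Observer case concrete, here is a small hypothetical sketch of how two textually identical submodules can behave differently because each refers to a name from its enclosing module. The modules Xenops_like and Cluster_like and the value service_name are invented for illustration; the real Observer modules capture different functions.

```ocaml
module Xenops_like = struct
  let service_name = "xenops" (* hypothetical enclosing definition *)

  module Observer = struct
    (* refers to Xenops_like.service_name *)
    let describe () = "observer for " ^ service_name
  end
end

module Cluster_like = struct
  let service_name = "cluster" (* hypothetical enclosing definition *)

  module Observer = struct
    (* same source text, but refers to Cluster_like.service_name *)
    let describe () = "observer for " ^ service_name
  end
end
```

A textual diff of the two Observer modules reports no difference, yet they cannot simply be merged; the captured names would first have to become explicit parameters (for example, functor arguments).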


I've used really primitive ways of finding duplicated modules. One way to do this is:

$ grep -T -r . ocaml/**/*.ml | sort -k 2 | uniq -d -f 1 | grep module

Another way is to use duplo, which hashes parts of files to find duplicated sections of arbitrary length, and then to look for "module" in its massive output:

$ find ocaml -type f \( -iname "*.ml" \) | duplo - out.txt 

This obviously only looks for duplicated module names as an indication of duplicated module code; it doesn't capture duplicated code outside of modules (or inside modules that don't share a name). This could be mitigated by manually going through out.txt (practically impossible), or by varying the minimum duplicated lines parameter with duplo -ml XX. This was not yet undertaken, as modules are easier to factor out than parts of freestanding code.
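
The name-based search can also be sketched in OCaml. This is an assumption-laden illustration, not the method used for this PR: module_names and duplicated_modules are hypothetical helpers, the input is a list of (path, contents) pairs rather than files on disk, and only lines of the form "module Name ..." are recognised. It shares the limitation described above, since it matches on names and not on module bodies.

```ocaml
module SMap = Map.Make (String)

(* Names declared by lines of the form "module Name ..."; "module type"
   declarations are skipped. *)
let module_names (contents : string) : string list =
  String.split_on_char '\n' contents
  |> List.filter_map (fun line ->
         match String.split_on_char ' ' (String.trim line) with
         | "module" :: name :: _ when name <> "" && name <> "type" ->
             Some name
         | _ ->
             None
     )

(* Map each module name to the files declaring it, keeping only names
   declared in more than one file. *)
let duplicated_modules (files : (string * string) list) :
    (string * string list) list =
  let add acc (path, contents) =
    List.fold_left
      (fun acc name ->
        let prev = Option.value (SMap.find_opt name acc) ~default:[] in
        SMap.add name (path :: prev) acc
      )
      acc
      (List.sort_uniq compare (module_names contents))
  in
  List.fold_left add SMap.empty files
  |> SMap.bindings
  |> List.filter (fun (_, paths) -> List.length paths > 1)
  |> List.map (fun (name, paths) -> (name, List.rev paths))
```

Every hit is only a candidate: the bodies still have to be diffed by hand, as was done for the modules listed in this PR.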

@edwintorok
Contributor

edwintorok commented Jul 18, 2024

The other 2 commits look ok, but posix_channel I'm dropping completely in the epoll PR, so probably not worth updating.
Can you remove the posix_channel commit and leave just the other 2?

Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
@last-genius last-genius force-pushed the private/asultanov/duplicate-removal branch from e8624ca to 6e5893b Compare July 18, 2024 15:23
@last-genius
Contributor Author

last-genius commented Jul 19, 2024

These changes passed the BST+BVT test suites (Run 201571). There are some failures in the livepatch tests, but these seem to have been happening before too. A couple of tests were also submitted but never started.

@psafont psafont merged commit fda9275 into xapi-project:master Jul 19, 2024
15 checks passed
@last-genius last-genius deleted the private/asultanov/duplicate-removal branch July 19, 2024 13:14
@last-genius last-genius restored the private/asultanov/duplicate-removal branch October 28, 2024 08:26

4 participants