Fix issues with Docker setup by mabhay3420 · Pull Request #2235 · google/heir

mabhay3420 · 2025-09-18T17:29:15Z

Disclaimer: Mostly AI generated code

The main goal is to fix the libomp-dev and clang-related issues on ubuntu-amd64 hosts (#2234). The fixes are taken from #2179.
A few other improvements to make the dev experience better:

The instructions in docker/Readme are converted to a script. The script also explicitly specifies linux/amd64 as the platform, since on macOS linux/arm64 is chosen by default, which resulted in errors for me.
Since the user is expected to have already cloned the repo, we simply mount the project dir inside the container instead of downloading it again. This also means the setup can be used as a persistent dev environment.
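
The two changes above could be sketched roughly as follows. This is an illustration, not the script added in this PR: the image tag `heir-dev`, the Dockerfile path `docker/Dockerfile`, and the mount point `/home/heiruser/heir` are assumptions (the last one is guessed from the `heiruser@...:~/heir$` prompt in the log further down). By default the function only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch only: every value marked "assumed" is illustrative, not taken from the PR.
IMAGE="${IMAGE:-heir-dev}"                      # assumed image tag
DOCKERFILE="${DOCKERFILE:-docker/Dockerfile}"   # assumed Dockerfile path inside the repo
MOUNT="${MOUNT:-/home/heiruser/heir}"           # assumed mount point inside the container
PLATFORM="linux/amd64"                          # pinned so macOS hosts do not pick linux/arm64
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

heir_docker_dev() {
  repo_dir="${1:-$PWD}"
  # Build the image for the pinned platform.
  run docker build --platform "$PLATFORM" -t "$IMAGE" -f "$repo_dir/$DOCKERFILE" "$repo_dir"
  # Bind-mount the existing checkout instead of cloning the repo again.
  run docker run --platform "$PLATFORM" -it -v "$repo_dir:$MOUNT" -w "$MOUNT" "$IMAGE"
}
```

Calling `heir_docker_dev` from the repository root prints one `docker build` and one `docker run` line. Because the checkout is bind-mounted, edits made inside the container are ordinary files on the host and survive the container being removed.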

I can remove the extras if they are not considered useful or should be reviewed separately.
Also how do we test in a clean environment?

google-cla · 2025-09-18T17:29:21Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

j2kun · 2025-09-18T17:38:43Z

I'm not a docker expert and this dockerfile is best-effort, so if it works for you and @ai-mannamalai then it's fine by me.

> Also how do we test in a clean environment?

bazel test //...:all will build and test everything, while bazel build //tools:heir-opt builds only the minimal target needed for development.
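
As a small sketch of the two options just mentioned, a helper could map a check level to the matching Bazel command. The function name `heir_check_cmd` and the `quick`/`full` labels are made up for illustration; only the two Bazel commands come from the comment above. Inside the container, `bazelisk` may be the installed launcher rather than `bazel`, as the log later in this thread suggests.

```shell
#!/bin/sh
# Sketch: print the Bazel command for a given check level.
# heir_check_cmd and the level names are hypothetical helpers, not part of HEIR.
heir_check_cmd() {
  case "${1:-quick}" in
    quick) echo "bazel build //tools:heir-opt" ;;  # minimal target needed for development
    full)  echo "bazel test //...:all" ;;          # build and test everything
    *)     echo "usage: heir_check_cmd [quick|full]" >&2; return 2 ;;
  esac
}

# Example: show the command for a full check.
heir_check_cmd full
```

To run the printed command rather than just display it, something like `sh -c "$(heir_check_cmd quick)"` from the repository root should work.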

j2kun · 2025-09-18T17:39:18Z

Please squash all commits into one and I can merge it.

mabhay3420 · 2025-09-18T17:43:44Z

The tools work on my end:


heiruser@b3721724568a:~/heir$ bazelisk build //tools:heir-opt //tools:heir-translate
INFO: Analyzed 2 targets (32 packages loaded, 321 targets configured).
INFO: Found 2 targets...
INFO: Elapsed time: 9.227s, Critical Path: 0.36s
INFO: 1 process: 9 action cache hit, 1 internal.
INFO: Build completed successfully, 1 total action
heiruser@b3721724568a:~/heir$ ./bazel-bin/ 
external/ lib/      tools/    
heiruser@b3721724568a:~/heir$ ./bazel-bin/tools/heir-opt --help
OVERVIEW: HEIR Pass Driver
Available Dialects: affine, arith, bgv, bufferization, builtin, cggi, ckks, comb, emitc, func, jaxite, jaxiteword, lattigo, linalg, llvm, lwe, math, math_ext, memref, mgmt, mod_arith, openfhe, polynomial, random, rns, scf, secret, tensor, tensor_ext, tfhe_rust, tfhe_rust_bool, tosa
USAGE: heir-opt [options] <input file>

OPTIONS:

Color Options:

  --color                                               - Use colors in output (default=autodetect)

General options:

  --allow-unregistered-dialect                          - Allow operation with no registered dialects
  --disable-i2p-p2i-opt                                 - Disables inttoptr/ptrtoint roundtrip optimization
  --dump-pass-pipeline                                  - Print the pipeline that will be run
  --elide-resource-data-from-bytecode                   - Elide resources when generating bytecode
  --emit-bytecode                                       - Emit bytecode when generating output
  --emit-bytecode-version=<value>                       - Use specified bytecode when generating output
  --irdl-file=<filename>                                - IRDL file to register before processing the input
  --list-passes                                         - Print the list of registered passes and exit
  --load-dialect-plugin=<string>                        - Load dialects from plugin library
  --load-pass-plugin=<string>                           - Load passes from plugin library
  --log-actions-to=<string>                             - Log action execution to a file, or stderr if  '-' is passed
  --log-mlir-actions-filter=<string>                    - Comma separated list of locations to filter actions from logging
  --mlir-debug-counter=<string>                         - Comma separated list of debug counter skip and count arguments
  --mlir-diagnostic-verbosity-level=<value>             - Choose level of diagnostic information
    =errors                                             -   Errors only
    =warnings                                           -   Errors and warnings
    =remarks                                            -   Errors, warnings and remarks
  --mlir-disable-diagnostic-notes                       - Disable diagnostic notes.
  --mlir-disable-threading                              - Disable multi-threading within MLIR, overrides any further call to MLIRContext::enableMultiThreading()
  --mlir-elide-elementsattrs-if-larger=<uint>           - Elide ElementsAttrs with "..." that have more elements than the given upper limit
  --mlir-elide-resource-strings-if-larger=<uint>        - Elide printing value of resources if string is too long in chars.
  --mlir-enable-debugger-hook                           - Enable Debugger hook for debugging MLIR Actions
  --mlir-generate-reproducer=<filename>                 - Generate an mlir reproducer at the provided filename (no crash required)
  --mlir-output-format=<value>                          - Output format for timing data
    =text                                               -   display the results in text format
    =json                                               -   display the results in JSON format
  --mlir-pass-pipeline-crash-reproducer=<string>        - Generate a .mlir reproducer file at the given output path if the pass manager crashes or fails
  --mlir-pass-pipeline-local-reproducer                 - When generating a crash reproducer, attempt to generated a reproducer with the smallest pipeline.
  --mlir-pass-statistics                                - Display the statistics of each pass
  --mlir-pass-statistics-display=<value>                - Display method for pass statistics
    =list                                               -   display the results in a merged list sorted by pass name
    =pipeline                                           -   display the results with a nested pipeline view
  --mlir-pretty-debuginfo                               - Print pretty debug info in MLIR output
  --mlir-print-debug-counter                            - Print out debug counter information after all counters have been accumulated
  --mlir-print-debuginfo                                - Print debug info in MLIR output
  --mlir-print-elementsattrs-with-hex-if-larger=<long>  - Print DenseElementsAttrs with a hex string that have more elements than the given upper limit (use -1 to disable)
  --mlir-print-ir-after=<pass-arg>                      - Print IR after specified passes
  --mlir-print-ir-after-all                             - Print IR after each pass
  --mlir-print-ir-after-change                          - When printing the IR after a pass, only print if the IR changed
  --mlir-print-ir-after-failure                         - When printing the IR after a pass, only print if the pass failed
  --mlir-print-ir-before=<pass-arg>                     - Print IR before specified passes
  --mlir-print-ir-before-all                            - Print IR before each pass
  --mlir-print-ir-module-scope                          - When printing IR for print-ir-[before|after]{-all} always print the top-level operation
  --mlir-print-ir-tree-dir=<string>                     - When printing the IR before/after a pass, print file tree rooted at this directory. Use in conjunction with mlir-print-ir-* flags
  --mlir-print-local-scope                              - Print with local scope and inline information (eliding aliases for attributes, types, and locations)
  --mlir-print-op-on-diagnostic                         - When a diagnostic is emitted on an operation, also print the operation as an attached note
  --mlir-print-skip-regions                             - Skip regions when printing ops.
  --mlir-print-stacktrace-on-diagnostic                 - When a diagnostic is emitted, also print the stack trace as an attached note
  --mlir-print-unique-ssa-ids                           - Print unique SSA ID numbers for values, block arguments and naming conflicts across all regions
  --mlir-print-value-users                              - Print users of operation results and block arguments as a comment
  --mlir-timing                                         - Display execution times
  --mlir-timing-display=<value>                         - Display method for timing data
    =list                                               -   display the results in a list sorted by total time
    =tree                                               -   display the results ina with a nested tree view
  --mlir-use-nameloc-as-prefix                          - Print SSA IDs using NameLocs as prefixes
  --mlir-very-unsafe-disable-verifier-on-parsing        - Disable the verifier on parsing (very unsafe)
  --no-implicit-module                                  - Disable implicit addition of a top-level module op during parsing
  -o <filename>                                         - Output filename
  --output-split-marker=<string>                        - Split marker to use for merging the ouput
  --pass-pipeline=<string>                              - Textual description of the pass pipeline to run
  --profile-actions-to=<string>                         - Profile action execution to a file, or stderr if  '-' is passed
  --run-reproducer                                      - Run the pipeline stored in the reproducer
  Compiler passes to run
    Passes:
      --add-client-interface                            -   Add client interfaces to secret functions
        --ciphertext-size=<int>                         - Power of two length of the ciphertexts the data is packed in.
        --enable-layout-assignment                      - If false, skips the emission of layout assignment operations, essentially assuming that the input was already using correctly (ciphertext-)sized tensors.
      --affine-data-copy-generate                       -   Generate explicit copying for affine memory operations
        --fast-mem-capacity=<ulong>                     - Set fast memory space capacity in KiB (default: unlimited)
        --fast-mem-space=<uint>                         - Fast memory space identifier for copy generation (default: 1)
        --generate-dma                                  - Generate DMA instead of point-wise copy
        --min-dma-transfer=<int>                        - Minimum DMA transfer size supported by the target in bytes
        --skip-non-unit-stride-loops                    - Testing purposes: avoid non-unit stride loop choice depths for copy placement
        --slow-mem-space=<uint>                         - Slow memory space identifier for copy generation (default: 0)
        --tag-mem-space=<uint>                          - Tag memory space identifier for copy generation (default: 0)
      --affine-expand-index-ops                         -   Lower affine operations operating on indices into more fundamental operations
      --affine-expand-index-ops-as-affine               -   Lower affine operations operating on indices into affine.apply operations
      --affine-loop-coalescing                          -   Coalesce nested loops with independent bounds into a single loop
      --affine-loop-fusion                              -   Fuse affine loop nests
        --compute-tolerance=<number>                    - Fractional increase in additional computation tolerated while fusing
        --fast-mem-space=<uint>                         - Faster memory space number to promote fusion buffers to
        --local-buf-threshold=<ulong>                   - Threshold size (KiB) for promoting local buffers to fast memory space
        --maximal                                       - Enables maximal loop fusion
        --mode=<value>                                  - fusion mode to attempt
    =greedy                                       -   Perform greedy (both producer-consumer and sibling)  fusion
    =producer                                     -   Perform only producer-consumer fusion
    =sibling                                      -   Perform only sibling fusion
      --affine-loop-invariant-code-motion               -   Hoist loop invariant instructions outside of affine loops
      --affine-loop-normalize                           -   Apply normalization transformations to affine loop-like ops
        --promote-single-iter                           - Promote single iteration loops
      --affine-loop-tile                                -   Tile affine loop nests
        --cache-size=<ulong>                            - Set size of cache to tile for in KiB (default: 512)
        --separate                                      - Separate full and partial tiles (default: false)
        --tile-size=<uint>                              - Use this tile size for all loops
        --tile-sizes=<uint>                             - List of tile sizes for each perfect nest (overridden by -tile-size)
      --affine-loop-unroll                              -   Unroll affine loops
        --cleanup-unroll                                - Fully unroll the cleanup loop when possible.
        --unroll-factor=<uint>                          - Use this unroll factor for all loops being unrolled
        --unroll-full                                   - Fully unroll loops
        --unroll-full-threshold=<uint>                  - Unroll all loops with trip count less than or equal to this
        --unroll-num-reps=<uint>                        - Unroll innermost loops repeatedly this many times
        --unroll-up-to-factor                           - Allow unrolling up to the factor specified
      --affine-loop-unroll-jam                          -   Unroll and jam affine loops
        --unroll-jam-factor=<uint>                      - Use this unroll jam factor for all loops (default 4)
      --affine-parallelize                              -   Convert affine.for ops into 1-D affine.parallel
        --max-nested=<uint>                             - Maximum number of nested parallel loops to produce. Defaults to unlimited (UINT_MAX).
        --parallel-reductions                           - Whether to parallelize reduction loops. Defaults to false.
      --affine-pipeline-data-transfer                   -   Pipeline non-blocking data transfers between explicitly managed levels of the memory hierarchy
      --affine-raise-from-memref                        -   Turn some memref operators to affine operators where supported
      --affine-scalrep                                  -   Replace affine memref accesses by scalars by forwarding stores to loads and eliminating redundant loads
      --affine-simplify-min-max                         -   Simplify affine min/max/apply
      --affine-simplify-structures                      -   Simplify affine expressions in maps/sets and normalize memrefs
      --affine-super-vectorize                          -   Vectorize to a target independent n-D vector abstraction
        --test-fastest-varying=<long>                   - Specify a 1-D, 2-D or 3-D pattern of fastest varying memory dimensions to match. See defaultPatterns in Vectorize.cpp for a description and examples. This is used for testing purposes
        --vectorize-reductions                          - Vectorize known reductions expressed via iter_args. Switched off by default.
        --virtual-vector-size=<long>                    - Specify an n-D virtual vector size for vectorization. This must be greater than zero.
      --annotate-mgmt                                   -   Annotate MgmtAttr for secret SSA values in the IR
        --base-level=<int>                              - Level to start counting from (used by B/FV)
      --annotate-module                                 -   Annotate ModuleOp with Scheme And/Or Backend
        --backend=<string>                              - The backend to annotate the module with.
        --scheme=<string>                               - The scheme to annotate the module with.
      --annotate-secretness                             -   Annotate secret SSA values in the IR
        --verbose                                       - If true, annotate secretness state all values, including public ones, and values with missing or inconclusive analysis.
      --apply-folders                                   -   Apply all folding patterns from canonicalize
      --arith-to-cggi                                   -   Lower `arith` to `cggi` dialect.
      --arith-to-cggi-quart                             -   Lower `arith` to `cggi` dialect and divide each operation into smaller parts.
      --arith-to-mod-arith                              -   Lower standard `arith` to `mod-arith`.
        --modulus=<long>                                - Modulus to use for the mod-arith dialect. If not specified, the pass will use the natural modulus for that integer type
      --bgv-to-lwe                                      -   Lower `bgv` to `lwe` dialect.
      --buffer-deallocation-simplification              -   Optimizes `bufferization.dealloc` operation for more efficient codegen
      --buffer-hoisting                                 -   Optimizes placement of allocation operations by moving them into common dominators and out of nested regions
      --buffer-loop-hoisting                            -   Optimizes placement of allocation operations by moving them out of loop nests
      --buffer-results-to-out-params                    -   Converts memref-typed function results to out-params
        --add-result-attr                               - Add the attribute 'bufferize.result' to all output parameters.
        --hoist-static-allocs                           - Hoist static allocations to call sites.
      --bufferization-lower-deallocations               -   Lowers `bufferization.dealloc` operations to `memref.dealloc`operations
      --canonicalize                                    -   Canonicalize operations
        --disable-patterns=<string>                     - Labels of patterns that should be filtered out during application
        --enable-patterns=<string>                      - Labels of patterns that should be used during application, all other patterns are filtered out
        --max-iterations=<long>                         - Max. iterations between applying patterns / simplifying regions
        --max-num-rewrites=<long>                       - Max. number of pattern rewrites within an iteration
        --region-simplify=<value>                       - Perform control flow optimizations to the region tree
    =disabled                                     -   Don't run any control-flow simplification.
    =normal                                       -   Perform simple control-flow simplifications (e.g. dead args elimination).
    =aggressive                                   -   Perform aggressive control-flow simplification (e.g. block merging).
        --test-convergence                              - Test only: Fail pass on non-convergence to detect cyclic pattern
        --top-down                                      - Seed the worklist in general top-down order
      --cggi-boolean-vectorize                          -   Group different logic gates with the packed API
        --parallelism=<int>                             - Parallelism factor for batching. 0 is infinite parallelism
      --cggi-decompose-operations                       -   Expands CGGI operations into LWE operations and programmable bootstraps
        --expand-lincomb                                - Expand lincomb operations to the PBS and scalar level
      --cggi-to-jaxite                                  -   Lower `cggi` to `jaxite` dialect.
      --cggi-to-tfhe-rust                               -   Lower `cggi` to `tfhe_rust` dialect.
      --cggi-to-tfhe-rust-bool                          -   Lower `cggi` to `tfhe_rust_bool` dialect.
      --ckks-to-lwe                                     -   Lower `ckks` to `lwe` dialect.
      --collapse-insertion-chains                       -   Collapse chains of extract/insert ops into rotate ops when possible
      --compare-to-sign-rewrite                         -   Rewrites arith.cmpi/arith.cmpf to a math_ext.sign based expression
      --composite-fixed-point-pass                      -   Composite fixed point pass
        --max-iterations=<int>                          - Maximum number of iterations if inner pipeline
        --name=<string>                                 - Composite pass display name
        --pipeline=<string>                             - Composite pass inner pipeline
      --control-flow-sink                               -   Sink operations into conditional blocks
      --convert-arith-to-llvm                           -   Convert Arith dialect to LLVM dialect
        --index-bitwidth=<uint>                         - Bitwidth of the index type, 0 to use size of machine word
      --convert-cf-to-llvm                              -   Convert ControlFlow operations to the LLVM dialect
        --index-bitwidth=<uint>                         - Bitwidth of the index type, 0 to use size of machine word
      --convert-elementwise-to-affine                   -   This pass lowers ElementwiseMappable operations to Affine loops.
        --convert-dialects=<string>                     - comma-separated list of dialects to run this pass on 
        --convert-ops=<string>                          - comma-separated list of ops to run this pass on 
      --convert-elementwise-to-linalg                   -   Convert ElementwiseMappable ops to linalg
      --convert-func-to-llvm                            -   Convert from the Func dialect to the LLVM dialect
        --index-bitwidth=<uint>                         - Bitwidth of the index type, 0 to use size of machine word
        --use-bare-ptr-memref-call-conv                 - Replace FuncOp's MemRef arguments with bare pointers to the MemRef element types
      --convert-if-to-select                            -   Convert scf.if operations on secret conditions to arith.select operations.
      --convert-linalg-to-affine-loops                  -   Lower the operations from the linalg dialect into affine loops
      --convert-linalg-to-loops                         -   Lower the operations from the linalg dialect into loops
      --convert-linalg-to-parallel-loops                -   Lower the operations from the linalg dialect into parallel loops
      --convert-polynomial-mul-to-ntt                   -   Rewrites polynomial operations to their NTT equivalents
      --convert-scf-to-cf                               -   Convert SCF dialect to ControlFlow dialect, replacing structured control flow with a CFG
      --convert-secret-extract-to-static-extract        -   Convert `tensor.extract` operations on secret index to static extract operations.
      --convert-secret-for-to-static-for                -   Convert secret scf.for ops to affine.for ops with constant bounds.
        --convert-all-scf-for                           - If true, convert all scf.for ops to affine.for, not just those with secret bounds.
      --convert-secret-insert-to-static-insert          -   Convert `tensor.insert` operations on secret index to static insert operations.
      --convert-secret-while-to-static-for              -   Convert secret scf.while ops to affine.for ops that have constant bounds.
      --convert-tensor-to-scalars                       -   Effectively 'unrolls' tensors of static shape to scalars.
        --max-size=<int>                                - Limits `unrolling` to tensors with at most max-size elements
      --convert-to-ciphertext-semantics                 -   Converts programs with tensor semantics to ciphertext semantics
        --ciphertext-size=<int>                         - Power of two length of the ciphertexts the data is packed in.
      --convert-to-llvm                                 -   Convert to LLVM via dialect interfaces found in the input IR
        --allow-pattern-rollback                        - Experimental performance flag to disallow pattern rollback
        --dynamic                                       - Use op conversion attributes to configure the conversion
        --filter-dialects=<string>                      - Test conversion patterns of only the specified dialects
      --cse                                             -   Eliminate common sub-expressions
      --drop-equivalent-buffer-results                  -   Remove MemRef return values that are equivalent to a bbArg
      --drop-unit-dims                                  -   Drops unit dimensions from linalg ops.
      --eliminate-empty-tensors                         -   Try to eliminate all tensor.empty ops.
      --emit-c-interface                                -   Adds llvm.emit_c_interface to each public function.
      --empty-tensor-to-alloc-tensor                    -   Replace all empty ops by alloc_tensor ops.
      --expand-copy                                     -   Expands memref.copy ops to explicit affine loads and stores
        --disable-affine-loop                           - Use this to control to disable using affine loops
      --extract-loop-body                               -   Extracts logic of a loop bodies into functions.
        --min-body-size=<uint>                          - Use this to control the minimum loop body size to apply this pass
        --min-loop-size=<uint>                          - Use this to control the minimum loop size to apply this pass
      --finalize-memref-to-llvm                         -   Finalize MemRef dialect to LLVM dialect conversion
        --index-bitwidth=<uint>                         - Bitwidth of the index type, 0 to use size of machine word
        --use-aligned-alloc                             - Use aligned_alloc in place of malloc for heap allocations
        --use-generic-functions                         - Use generic allocation and deallocation functions instead of the classic 'malloc', 'aligned_alloc' and 'free' functions
      --fold-constant-tensors                           -   This pass folds any constant tensors.
      --fold-convert-layout-into-assign-layout          -   Merges tensor_ext.convert_layout ops into preceding tensor_ext.assign_layout ops
      --forward-insert-to-extract                       -   Forward inserts to extracts within a single block
      --forward-store-to-load                           -   Forward stores to loads within a single block
      --full-loop-unroll                                -   Fully unroll all loops
      --generate-param-bfv                              -   Generate BFV Scheme Parameter
        --encryption-technique-extended                 - If true, uses EXTENDED encryption technique for encryption. (See https://ia.cr/2022/915)
        --mod-bits=<int>                                - Default number of bits for all prime coefficient modulusto use for the ciphertext space.
        --model=<string>                                - Noise model to validate against.
        --plaintext-modulus=<long>                      - Plaintext modulus.
        --slot-number=<int>                             - Minimum number of slots for parameter generation.
        --use-public-key                                - If true, uses a public key for encryption.
      --generate-param-bgv                              -   Generate BGV Scheme Parameter using a given noise model
        --encryption-technique-extended                 - If true, uses EXTENDED encryption technique for encryption. (See https://ia.cr/2022/915)
        --model=<string>                                - Noise model to validate against.
        --plaintext-modulus=<long>                      - Plaintext modulus.
        --slot-number=<int>                             - Minimum number of slots for parameter generation.
        --use-public-key                                - If true, uses a public key for encryption.
      --generate-param-ckks                             -   Generate CKKS Scheme Parameter
        --encryption-technique-extended                 - If true, uses EXTENDED encryption technique for encryption. (See https://ia.cr/2022/915)
        --first-mod-bits=<int>                          - Default number of bits of the first prime coefficient modulus to use for the ciphertext space.
        --input-range=<int>                             - The range of the plaintexts for input ciphertexts for the CKKS scheme; default to [-1, 1]. For other ranges like [-D, D], use D.
        --scaling-mod-bits=<int>                        - Default number of bits of the scaling prime coefficient modulus to use for the ciphertext space.
        --slot-number=<int>                             - Minimum number of slots for parameter generation.
        --use-public-key                                - If true, uses a public key for encryption.
      --generate-runtime-verification                   -   Generate additional runtime op verification checks
      --implement-rotate-and-reduce                     -   Implement tensor_ext.rotate_and_reduce ops with baby-steps / giant-steps
      --implement-shift-network                         -   Implement tensor_ext.convert_layout ops as shift newtorks
        --ciphertext-size=<int>                         - Power of two length of the ciphertexts the data is packed in.
      --inline                                          -   Inline function calls
        --default-pipeline=<string>                     - The optimizer pipeline used for callables that do not have a dedicated optimizer pipeline in opPipelineList
        --inlining-threshold=<uint>                     - If the ratio between the number of the operations in the callee and the number of the operations in the caller exceeds this value (in percentage), then the callee is not inlined even if it is legal to inline it
        --max-iterations=<uint>                         - Maximum number of iterations when inlining within an SCC
        --op-pipelines=<pass-manager>                   - Callable operation specific optimizer pipelines (in the form of `dialect.op(pipeline)`)
      --inline-activations                              -   Inline activation functions
      --insert-rotate                                   -   Vectorize arithmetic FHE operations using HECO-style heuristics
      --lattigo-alloc-to-inplace                        -   Convert AllocOps to InplaceOps in Lattigo
      --lattigo-configure-crypto-context                -   Configure the crypto context in Lattigo
        --entry-function=<string>                       - Default entry function name of entry function.
      --layout-optimization                             -   Optimize layout conversions in the IR
        --ciphertext-size=<int>                         - Power of two length of the ciphertexts the data is packed in.
      --layout-propagation                              -   Propagate ciphertext layouts through the IR
        --ciphertext-size=<int>                         - Power of two length of the ciphertexts the data is packed in.
      --linalg-block-pack-matmul                        -   Convert linalg matmul ops to block layout and back
        --allow-padding                                 - Allow packing padding
        --block-factors=<long>                          - Block factors (mb, nb, kb) for relayout
        --lhs-transpose-inner-blocks                    - Transpose LHS inner block layout [mb][kb] -> [kb][mb]
        --lhs-transpose-outer-blocks                    - Transpose LHS outer block layout [MB][KB] -> [KB][MB]
        --mnk-order=<long>                              - Permutation of matmul (M, N, K) dimensions order
        --mnk-padded-multiples=<long>                   - Next multiples of the packing sizes
        --rhs-transpose-inner-blocks                    - Transpose RHS inner block layout [kb][nb] -> [nb][kb]
        --rhs-transpose-outer-blocks                    - Transpose RHS outer block layout [KB][NB] -> [NB][KB]
      --linalg-canonicalizations                        -   This pass canonicalizes the linalg.transpose operation of a constant into a transposed constant.
      --linalg-detensorize                              -   Detensorize linalg ops
        --aggressive-mode                               - Detensorize all ops that qualify for detensoring along with branch operands and basic-block arguments.
      --linalg-fold-into-elementwise                    -   Fold transform, broadcast and other ops into elementwise
      --linalg-fold-unit-extent-dims                    -   Remove unit-extent dimension in Linalg ops on tensors
        --use-rank-reducing-slices                      - Generate rank-reducing slices instead of reassociative reshapes
      --linalg-fuse-elementwise-ops                     -   Fuse elementwise operations on tensors
      --linalg-generalize-named-ops                     -   Convert named ops into generic ops
      --linalg-inline-scalar-operands                   -   Inline scalar operands into linalg generic ops
      --linalg-morph-ops                                -   Convert linalg ops between forms
        --category-to-generic                           - convert category ops e.g. `linalg.elementwise` to `linalg.generic`
        --generic-to-named                              - convert linalg.generic to equivalent named ops
        --named-to-category                             - convert named ops to category op e.g. `linalg.elementwise`
        --named-to-generic                              - convert named ops e.g. `linalg.add` to `linalg.generic`
      --linalg-specialize-generic-ops                   -   Convert generic ops back to named ops
      --loop-invariant-code-motion                      -   Hoist loop invariant instructions outside of the loop
      --loop-invariant-subset-hoisting                  -   Hoist loop invariant subset ops outside of the loop
      --lower-affine                                    -   Lower Affine operations to a combination of Arith and SCF operations
      --lower-polynomial-eval                           -   Lowers the polynomial.eval operation
        --method=<value>                                - The method used to lower polynomial.eval
    =auto                                         -   Automatically selected
    =horner                                       -   Horner's method (monomial basis)
    =ps                                           -   Paterson-Stockmeyer method (monomial basis)
    =pscheb                                       -   Paterson-Stockmeyer method (Chebyshev basis)
      --lower-unpack                                    -   Lower tensor_ext.unpack to standard MLIR
      --lwe-add-debug-port                              -   Add debug port to (R)LWE encrypted functions
        --entry-function=<string>                       - Default entry function name of entry function.
      --lwe-to-lattigo                                  -   Lower `lwe` to `lattigo` dialect.
      --lwe-to-openfhe                                  -   Lower `lwe` to `openfhe` dialect.
      --lwe-to-polynomial                               -   Lower `lwe` to `polynomial` dialect.
      --mem2reg                                         -   Promotes memory slots into values.
        --region-simplify                               - Perform control flow optimizations to the region tree
      --memref-global-replace                           -   MemrefGlobalReplacePass forwards global memrefs accessors to arithmetic values
      --mod-arith-to-arith                              -   Lower `mod_arith` to standard `arith`.
      --mod-arith-to-mac                                -   Finds consecutive ModArith mul and add operations and converts them to a Mac operation
      --new-layout-propagation                          -   Propagate ciphertext layouts through the IR
        --ciphertext-size=<int>                         - Power of two length of the ciphertexts the data is packed in.
      --one-shot-bufferize                              -   One-Shot Bufferize
        --allow-return-allocs-from-loops                - Allows returning/yielding new allocations from a loop.
        --allow-unknown-ops                             - Allows unknown (not bufferizable) ops in the input IR.
        --analysis-fuzzer-seed=<uint>                   - Test only: Analyze ops in random order with a given seed (fuzzer)
        --analysis-heuristic=<string>                   - Heuristic that control the IR traversal during analysis
        --buffer-alignment=<ulong>                      - Sets the alignment of newly allocated buffers.
        --bufferize-function-boundaries                 - Bufferize function boundaries (experimental).
        --check-parallel-regions                        - Account for parallel regions in RaW analysis.
        --copy-before-write                             - Skip the analysis. Make a buffer copy on every write.
        --dialect-filter=<string>                       - Restrict bufferization to ops from these dialects.
        --dump-alias-sets                               - Test only: Annotate tensor IR with alias sets
        --function-boundary-type-conversion=<value>     - Controls layout maps when bufferizing function signatures.
    =infer-layout-map
    =identity-layout-map
    =fully-dynamic-layout-map
        --must-infer-memory-space                       - The memory space of an memref types must always be inferred. If unset, a default memory space of 0 is used otherwise.
        --no-analysis-func-filter=<string>              - Skip analysis of functions with these symbol names.Set copyBeforeWrite to true when bufferizing them.
        --print-conflicts                               - Test only: Annotate IR with RaW conflicts. Requires test-analysis-only.
        --test-analysis-only                            - Test only: Only run inplaceability analysis and annotate IR
        --unknown-type-conversion=<value>               - Controls layout maps for non-inferrable memref types.
    =infer-layout-map
    =identity-layout-map
    =fully-dynamic-layout-map
        --use-encoding-for-memory-space                 - Use the Tensor encoding attribute for the memory space. Exclusive to the 'must-infer-memory-space' option
      --openfhe-configure-crypto-context                -   Configure the crypto context in OpenFHE
        --batch-size=<int>                              - Manually specify the batch size
        --digit-size=<int>                              - Manually specify the digit size for relinearization
        --entry-function=<string>                       - Default entry function name of entry function.
        --first-mod-size=<int>                          - Manually specify the first mod size
        --insecure                                      - Whether to use insecure parameter (defaults to false)
        --key-switching-technique-bv                    - Whether to use BV key switching technique (defaults to false)
        --level-budget-decode=<int>                     - Level budget for CKKS bootstrap decode (c2s) phase
        --level-budget-encode=<int>                     - Level budget for CKKS bootstrap encode (s2c) phase
        --max-relin-sk-deg=<int>                        - Manually specify the max number of relin sk deg
        --mul-depth=<int>                               - Manually specify the mul depth
        --num-large-digits=<int>                        - Manually specify the number of large digits for HYBRID relinearization
        --ring-dim=<int>                                - Manually specify the ring dimension (insecure is implied)
        --scaling-mod-size=<int>                        - Manually specify the scaling mod size
        --scaling-technique-fixed-manual                - Whether to use fixed manual scaling technique (defaults to false)
      --openfhe-count-add-and-key-switch                -   Count the number of add and key-switch operations in OpenFHE
      --openfhe-fast-rotation-precompute                -   Identify and apply EvalFastRotation when possible.
      --operation-balancer                              -   This pass balances addition and multiplication operations.
      --optimize-allocation-liveness                    -   This pass optimizes the liveness of temp allocations in the input function
      --optimize-relinearization                        -   Optimize placement of relinearization ops
        --allow-mixed-degree-operands                   - When true, allow ops to have mixed-degree ciphertexts as inputs, e.g., adding two ciphertexts with different key bases; this is supported by many FHE backends, like OpenFHE and Lattigo
        --use-loc-based-variable-names                  - When true, the ILP uses op source locations in variable names, which can help debug ILP model bugs.
      --ownership-based-buffer-deallocation             -   Adds all required dealloc operations for all allocations in the input program
        --private-function-dynamic-ownership            - Allows to add additional arguments to private functions to dynamically pass ownership of memrefs to callees. This can enable earlier deallocations.
      --polynomial-approximation                        -   Approximate ops by polynomials
      --polynomial-to-mod-arith                         -   Lower `polynomial` to standard MLIR dialects.
      --populate-scale-bgv                              -   Populate the scale for BGV (GHS variant) ciphertext
      --populate-scale-ckks                             -   Populate the scale for CKKS ciphertext
        --before-mul-include-first-mul                  - Modulus switching before each multiplication, including the first multiplication (default to false)
      --print-ir                                        -   Print IR on the debug stream
        --label=<string>                                - Label
      --print-op-stats                                  -   Print statistics of operations
        --json                                          - print the stats as JSON
      --promote-buffers-to-stack                        -   Promotes heap-based allocations to automatically managed stack-based allocations
        --max-alloc-size-in-bytes=<uint>                - Maximal size in bytes to promote allocations to stack.
        --max-rank-of-allocated-memref=<uint>           - Maximal memref rank to promote dynamic buffers.
      --propagate-annotation                            -   Propagate annotation from operation to subsequent operations
        --attr-name=<string>                            - The attribute name to propagate with.
        --reverse                                       - Whether to propagate in reverse
      --reconcile-unrealized-casts                      -   Simplify and eliminate unrealized conversion casts
      --remove-dead-values                              -   Remove dead values
      --remove-unused-memref                            -   Cleanup any unused memrefs
      --rotate-and-reduce                               -   Use a logarithmic number of rotations to reduce a tensor.
      --sccp                                            -   Sparse Conditional Constant Propagation
      --secret-add-debug-port                           -   Add debug port to secret-arithmetic ops
      --secret-capture-generic-ambient-scope            -   Capture the ambient scope used in a secret.generic
      --secret-distribute-generic                       -   Distribute `generic` ops through their bodies.
        --distribute-through=<string>                   - comma-separated list of ops that should be distributed through
      --secret-extract-generic-body                     -   Extract the bodies of all generic ops into functions
      --secret-forget-secrets                           -   Convert secret types to standard types
      --secret-generic-absorb-constants                 -   Copy constants into a secret.generic body
      --secret-generic-absorb-dealloc                   -   Copy deallocs of internal memrefs into a secret.generic body
      --secret-import-execution-result                  -   Annotate execution result to secret-arithmetic ops
        --file-name=<string>                            - file name of the execution result
      --secret-insert-mgmt-bfv                          -   Place BFV ciphertext management operations
      --secret-insert-mgmt-bgv                          -   Place BGV ciphertext management operations
        --after-mul                                     - Modulus switching after each multiplication (default to false)
        --before-mul-include-first-mul                  - Modulus switching before each multiplication, including the first multiplication (default to false)
      --secret-insert-mgmt-ckks                         -   Place CKKS ciphertext management operations
        --after-mul                                     - Modulus switching after each multiplication (default to false)
        --before-mul-include-first-mul                  - Modulus switching before each multiplication, including the first multiplication (default to false)
        --bootstrap-waterline=<int>                     - Waterline for insert bootstrap op
        --slot-number=<int>                             - Default number of slots use for ciphertext space.
      --secret-merge-adjacent-generics                  -   Merge two adjacent generics into a single generic
      --secret-to-bgv                                   -   Lower `secret` to `bgv` dialect.
        --poly-mod-degree=<int>                         - Default degree of the cyclotomic polynomial modulus to use for ciphertext space.
      --secret-to-cggi                                  -   Lower `secret` to `cggi` dialect.
      --secret-to-ckks                                  -   Lower `secret` to `ckks` dialect.
        --poly-mod-degree=<int>                         - Default degree of the cyclotomic polynomial modulus to use for ciphertext space.
      --secret-to-mod-arith                             -   Lower `secret` to `mod-arith` dialect.
        --log-scale=<long>                              - Log base 2 of the scale for encoding floating points as ints.
        --modulus=<long>                                - Modulus to use for the mod-arith dialect. If not specified, the pass will use the natural modulus for that integer type
      --secretize                                       -   Adds secret argument attributes to entry function
        --function=<string>                             - function to add secret annotations to
      --select-rewrite                                  -   Rewrites arith.select to a CMUX style expression
      --shape-inference                                 -   Infer shapes for shaped types
      --simplify-depthwise-conv                         -   Simplify depthwise convolution.
      --snapshot-op-locations                           -   Generate new locations from the current IR
        --filename=<string>                             - The filename to print the generated IR
        --pretty-debuginfo                              - Print pretty debug info in MLIR output
        --print-debuginfo                               - Print debug info in MLIR output
        --print-local-scope                             - Print with local scope and inline information (eliding aliases for attributes, types, and locations
        --print-op-generic                              - Print the generic op form
        --tag=<string>                                  - A tag to use when fusing the new locations with the original. If unset, the locations are replaced.
      --split-preprocessing                             -   Splits a function into a preprocessing and a main part
      --sroa                                            -   Scalar Replacement of Aggregates
      --straight-line-vectorize                         -   A vectorizer for straight line programs.
        --dialect=<string>                              - Use this to restrict the dialect whose ops should be vectorized.
      --strip-debuginfo                                 -   Strip debug info from all operations
      --symbol-dce                                      -   Eliminate dead symbols
      --symbol-privatize                                -   Mark symbols private
        --exclude=<string>                              - Comma separated list of symbols that should not be marked private
      --tensor-ext-to-tensor                            -   Lower `tensor_ext` to `tensor` dialect.
      --tensor-linalg-to-affine-loops                   -   A port of convert-linalg-to-affine-loops for loops with tensor semantics
      --topological-sort                                -   Sort regions without SSA dominance in topological order
      --tosa-to-secret-arith                            -   Lower `tosa.sigmoid` to secret arith dialects.
      --unroll-and-forward                              -   Loop unrolls and forwards stores to loads.
      --validate-noise                                  -   Validate the HE circuit against a given noise model
        --annotate-noise-bound                          - Annotate the noise bound to the IR.
        --model=<string>                                - Noise model to validate against.
      --view-op-graph                                   -   Print Graphviz visualization of an operation
        --max-label-len=<uint>                          - Limit attribute/type length to number of chars
        --print-attrs                                   - Print attributes of operations
        --print-control-flow-edges                      - Print control flow edges
        --print-data-flow-edges                         - Print data flow edges
        --print-result-types                            - Print result types of operations
      --wrap-generic                                    -   Wraps regions using secret args in secret.generic bodies
    Pass Pipelines:
      --convert-to-data-oblivious                       -   Transforms a native program to data-oblivious program
      --heir-basic-mlir-to-llvm                         -   Lower basic MLIR to LLVM
      --heir-polynomial-to-llvm                         -   Run passes to lower the polynomial dialect to LLVM
      --heir-simd-vectorizer                            -   Run scheme-agnostic passes to convert FHE programs that operate on scalar types to equivalent programs that operate on vectors and use tensor_ext.rotate
        --experimental-disable-loop-unroll              - Experimental: disable loop unroll, may break analyses (default to false)
      --heir-tosa-to-arith                              -   Run passes to lower TOSA models with stripped quant types to arithmetic
        --full-unroll                                   - Full unroll all loops.
      --math-to-polynomial-approximation                -   Approximate math operations that cannot be expressed in FHE using polynomial approximations.
      --mlir-to-bfv                                     -   Convert a func using standard MLIR dialects to FHE using BFV.
        --annotate-noise-bound                          - If true, the noise predicted by noise model is annotated in the IR.
        --bfv-mod-bits=<int>                            - The number of bits for all moduli for B/FV
        --ciphertext-degree=<int>                       - The degree of the polynomials to use for ciphertexts; equivalently, the number of messages that can be packed into a single ciphertext.
        --ckks-bootstrap-waterline=<int>                - The number of levels to keep until bootstrapping in CKKS (c.f. --secret-insert-mgmt-ckks)
        --enable-arithmetization                        - If false, skip the arithmetization pipeline and try to directly lower to RLWE scheme (default to true)
        --encryption-technique-extended                 - If true, use extended encryption technique (default to false)
        --experimental-disable-loop-unroll              - Experimental: disable loop unroll, may break analyses (default to false)
        --first-mod-bits=<int>                          - The number of bits in the first modulus for CKKS
        --modulus-switch-before-first-mul               - Modulus switching right before the first multiplication (default to false)
        --noise-model=<string>                          - Noise model to use during parameter generation, see --generate-param pass options for available models
        --plaintext-execution-result-file-name=<string> - File name to import execution result from (c.f. --secret-import-execution-result)
        --plaintext-modulus=<long>                      - Plaintext modulus for BGV scheme (default to 65537)
        --scaling-mod-bits=<int>                        - The number of bits in the scaling modulus for CKKS
        --use-public-key                                - If true, use public key encryption (default to true)
      --mlir-to-bgv                                     -   Convert a func using standard MLIR dialects to FHE using BGV.
        --annotate-noise-bound                          - If true, the noise predicted by noise model is annotated in the IR.
        --bfv-mod-bits=<int>                            - The number of bits for all moduli for B/FV
        --ciphertext-degree=<int>                       - The degree of the polynomials to use for ciphertexts; equivalently, the number of messages that can be packed into a single ciphertext.
        --ckks-bootstrap-waterline=<int>                - The number of levels to keep until bootstrapping in CKKS (c.f. --secret-insert-mgmt-ckks)
        --enable-arithmetization                        - If false, skip the arithmetization pipeline and try to directly lower to RLWE scheme (default to true)
        --encryption-technique-extended                 - If true, use extended encryption technique (default to false)
        --experimental-disable-loop-unroll              - Experimental: disable loop unroll, may break analyses (default to false)
        --first-mod-bits=<int>                          - The number of bits in the first modulus for CKKS
        --modulus-switch-before-first-mul               - Modulus switching right before the first multiplication (default to false)
        --noise-model=<string>                          - Noise model to use during parameter generation, see --generate-param pass options for available models
        --plaintext-execution-result-file-name=<string> - File name to import execution result from (c.f. --secret-import-execution-result)
        --plaintext-modulus=<long>                      - Plaintext modulus for BGV scheme (default to 65537)
        --scaling-mod-bits=<int>                        - The number of bits in the scaling modulus for CKKS
        --use-public-key                                - If true, use public key encryption (default to true)
      --mlir-to-cggi                                    -   Convert a func using standard MLIR dialects to FHE using CGGI.
        --abc-fast                                      - Run abc in fast mode.
        --data-type=<value>                             - Data type to use for arithmetization, yosys must be enabled for Boolean.
    =Bool                                         -   booleanize with Yosys
    =Integer                                      -   decompose operations into 32 bit data types
        --mode=<value>                                  - Map gates to boolean gates or lookup table gates.
    =Boolean                                      -   use boolean gates
    =LUT                                          -   use lookup tables
        --print-stats                                   - Prints statistics about the optimized circuit
        --unroll-factor=<int>                           - Unroll loops by a given factor before optimizing. A value of zero (default) prevents unrolling.
        --use-submodules                                - Extracts secret.generic bodies into submodules before optimizing. Default is true.
      --mlir-to-ckks                                    -   Convert a func using standard MLIR dialects to FHE using CKKS.
        --annotate-noise-bound                          - If true, the noise predicted by noise model is annotated in the IR.
        --bfv-mod-bits=<int>                            - The number of bits for all moduli for B/FV
        --ciphertext-degree=<int>                       - The degree of the polynomials to use for ciphertexts; equivalently, the number of messages that can be packed into a single ciphertext.
        --ckks-bootstrap-waterline=<int>                - The number of levels to keep until bootstrapping in CKKS (c.f. --secret-insert-mgmt-ckks)
        --enable-arithmetization                        - If false, skip the arithmetization pipeline and try to directly lower to RLWE scheme (default to true)
        --encryption-technique-extended                 - If true, use extended encryption technique (default to false)
        --experimental-disable-loop-unroll              - Experimental: disable loop unroll, may break analyses (default to false)
        --first-mod-bits=<int>                          - The number of bits in the first modulus for CKKS
        --modulus-switch-before-first-mul               - Modulus switching right before the first multiplication (default to false)
        --noise-model=<string>                          - Noise model to use during parameter generation, see --generate-param pass options for available models
        --plaintext-execution-result-file-name=<string> - File name to import execution result from (c.f. --secret-import-execution-result)
        --plaintext-modulus=<long>                      - Plaintext modulus for BGV scheme (default to 65537)
        --scaling-mod-bits=<int>                        - The number of bits in the scaling modulus for CKKS
        --use-public-key                                - If true, use public key encryption (default to true)
      --mlir-to-plaintext-backend                       -   Convert a func using standard MLIR dialects to plaintext backend executed in secret arithmetic
        --insert-debug-handler-calls                    - Insert function calls to an externally-defined debug function (cf. --secret-add-debug-port)
        --plaintext-modulus=<long>                      - Plaintext modulus for BGV/BFV scheme (if not specified, execute in the original integer type)
        --plaintext-size=<int>                          - The size of the plaintexts; i.e., the number of slots to use for packing.
      --mlir-to-secret-arithmetic                       -   Convert a func using standard MLIR dialects to secret dialect with arithmetic ops
        --annotate-noise-bound                          - If true, the noise predicted by noise model is annotated in the IR.
        --bfv-mod-bits=<int>                            - The number of bits for all moduli for B/FV
        --ciphertext-degree=<int>                       - The degree of the polynomials to use for ciphertexts; equivalently, the number of messages that can be packed into a single ciphertext.
        --ckks-bootstrap-waterline=<int>                - The number of levels to keep until bootstrapping in CKKS (c.f. --secret-insert-mgmt-ckks)
        --enable-arithmetization                        - If false, skip the arithmetization pipeline and try to directly lower to RLWE scheme (default to true)
        --encryption-technique-extended                 - If true, use extended encryption technique (default to false)
        --experimental-disable-loop-unroll              - Experimental: disable loop unroll, may break analyses (default to false)
        --first-mod-bits=<int>                          - The number of bits in the first modulus for CKKS
        --modulus-switch-before-first-mul               - Modulus switching right before the first multiplication (default to false)
        --noise-model=<string>                          - Noise model to use during parameter generation, see --generate-param pass options for available models
        --plaintext-execution-result-file-name=<string> - File name to import execution result from (c.f. --secret-import-execution-result)
        --plaintext-modulus=<long>                      - Plaintext modulus for BGV scheme (default to 65537)
        --scaling-mod-bits=<int>                        - The number of bits in the scaling modulus for CKKS
        --use-public-key                                - If true, use public key encryption (default to true)
      --scheme-to-fpt                                   -   Convert code expressed at FHE scheme level to Lattigo Go code.
      --scheme-to-jaxite                                -   Convert code expressed at FHE scheme level to jaxite code.
        --parallelism=<int>                             - batching size for parallelism. A value of -1 (default) is infinite parallelism
      --scheme-to-lattigo                               -   Convert code expressed at FHE scheme level to Lattigo Go code.
        --entry-function=<string>                       - Entry function
        --insert-debug-handler-calls                    - Insert function calls to an externally-defined debug function (cf. --lwe-add-debug-port)
      --scheme-to-openfhe                               -   Convert code expressed at FHE scheme level to OpenFHE C++ code.
        --entry-function=<string>                       - Entry function
        --insert-debug-handler-calls                    - Insert function calls to an externally-defined debug function (cf. --lwe-add-debug-port)
      --scheme-to-tfhe-rs                               -   Convert code expressed at FHE scheme level to tfhe-rs code.
      --torch-linalg-to-ckks                            -   Convert linalg MLIR exported from torch to CKKS
        --ciphertext-degree=<int>                       - The degree of the polynomials to use for ciphertexts; equivalently, the number of messages that can be packed into a single ciphertext.
        --ckks-bootstrap-waterline=<int>                - The number of levels to keep until bootstrapping in CKKS (c.f. --secret-insert-mgmt-ckks)
        --first-mod-bits=<int>                          - The number of bits in the first modulus for CKKS
        --scaling-mod-bits=<int>                        - The number of bits in the scaling modulus for CKKS
      --yosys-optimizer                                 -   The yosys optimizer pipeline.
        --abc-fast                                      - Run abc in fast mode.
        --mode=<value>                                  - Map gates to boolean gates or lookup table gates.
    =Boolean                                      -   use boolean gates
    =LUT                                          -   use lookup tables
        --print-stats                                   - Prints statistics about the optimized circuit
        --unroll-factor=<int>                           - Unroll loops by a given factor before optimizing. A value of zero (default) prevents unrolling.
        --use-submodules                                - Extracts secret.generic bodies into submodules before optimizing. Default is true.
  --show-dialects                                       - Print the list of registered dialects and exit
  --split-input-file[=<string>]                           - Split the input file into chunks using the given or default marker and process each chunk independently
  --verify-diagnostics                                  - Check that emitted diagnostics match expected-* lines on the corresponding line
  --verify-diagnostics=<value>                          - Check that emitted diagnostics match expected-* lines on the corresponding line
    =all                                                -   Check all diagnostics (expected, unexpected, near-misses)
    =<empty>                                            -   Check all diagnostics (expected, unexpected, near-misses)
    =only-expected                                      -   Check only expected diagnostics
  --verify-each                                         - Run the verifier after each transformation pass
  --verify-roundtrip                                    - Round-trip the IR after parsing and ensure it succeeds

Generic Options:

  --help                                                - Display available options (--help-hidden for more)
  --help-list                                           - Display list of available options (--help-list-hidden for more)
  --version                                             - Display the version of this program
heiruser@b3721724568a:~/heir$ ./bazel-bin/tools/heir-translate --help
OVERVIEW: HEIR Translation Tool
USAGE: heir-translate [options] <input file>

OPTIONS:

Color Options:

  --color                                              - Use colors in output (default=autodetect)

General options:

  --allow-unregistered-dialect                         - Allow operation with no registered dialects (discouraged: testing only!)
  Translations to perform
      --emit-function-info                                - Emit function info (helper for python frontend)
      --emit-jaxite                                       - translate the jaxite dialect to python code for jaxite
      --emit-jaxiteword                                   - translate the JaxiteWord dialect to python code for jaxiteword
      --emit-lattigo                                      - translate the lattigo dialect to GO code against the Lattigo API
      --emit-metadata                                     - emit function signature metadata for the given MLIR
      --emit-openfhe-pke                                  - translate the openfhe dialect to C++ code against the OpenFHE pke API
      --emit-openfhe-pke-header                           - Emit a header corresponding to the C++ file generated by --emit-openfhe-pke
      --emit-openfhe-pke-pybind                           - Emit a C++ file containing pybind11 bindings for the input openfhe dialect IR--emit-openfhe-pke-pybind
      --emit-simfhe                                       - translate ckks dialect to SimFHE python code
      --emit-tfhe-rust                                    - translate the tfhe_rs dialect to Rust code for tfhe-rs
      --emit-tfhe-rust-bool                               - translate the tfhe-rs-bool dialect to Rust code for boolean tfhe-rs
      --emit-tfhe-rust-bool-packed                        - translate the tfhe-rs-bool dialect to Rust code for Belfort FPGA (boolean) tfhe-rs API
      --emit-tfhe-rust-hl                                 - translate the tfhe-rs dialect to HL Rust code
      --emit-verilog                                      - translate from arithmetic to verilog
      --import-autohog                                    - Import from AutoHoG JSON to HEIR MLIR
  --error-diagnostics-only                             - Filter all non-error diagnostics (discouraged: testing only!)
  --mlir-disable-threading                             - Disable multi-threading within MLIR, overrides any further call to MLIRContext::enableMultiThreading()
  --mlir-elide-elementsattrs-if-larger=<uint>          - Elide ElementsAttrs with "..." that have more elements than the given upper limit
  --mlir-elide-resource-strings-if-larger=<uint>       - Elide printing value of resources if string is too long in chars.
  --mlir-output-format=<value>                         - Output format for timing data
    =text                                              -   display the results in text format
    =json                                              -   display the results in JSON format
  --mlir-pretty-debuginfo                              - Print pretty debug info in MLIR output
  --mlir-print-debuginfo                               - Print debug info in MLIR output
  --mlir-print-elementsattrs-with-hex-if-larger=<long> - Print DenseElementsAttrs with a hex string that have more elements than the given upper limit (use -1 to disable)
  --mlir-print-local-scope                             - Print with local scope and inline information (eliding aliases for attributes, types, and locations)
  --mlir-print-op-on-diagnostic                        - When a diagnostic is emitted on an operation, also print the operation as an attached note
  --mlir-print-skip-regions                            - Skip regions when printing ops.
  --mlir-print-stacktrace-on-diagnostic                - When a diagnostic is emitted, also print the stack trace as an attached note
  --mlir-print-unique-ssa-ids                          - Print unique SSA ID numbers for values, block arguments and naming conflicts across all regions
  --mlir-print-value-users                             - Print users of operation results and block arguments as a comment
  --mlir-timing                                        - Display execution times
  --mlir-timing-display=<value>                        - Display method for timing data
    =list                                              -   display the results in a list sorted by total time
    =tree                                              -   display the results ina with a nested tree view
  --mlir-use-nameloc-as-prefix                         - Print SSA IDs using NameLocs as prefixes
  --no-implicit-module                                 - Disable the parsing of an implicit top-level module op
  -o <filename>                                        - Output filename
  --openfhe-include-type=<value>                       - The type of imports to use for OpenFHE
    =install-relative                                  -   Emit OpenFHE with install-relative import paths (default for user-facing code)
    =source-relative                                   -   Emit OpenFHE with source-relative import paths (default for HEIR-internal development)
    =embedded                                          -   Emit OpenFHE with embedded import paths (default for code to be included in OpenFHE source files)
  --output-split-marker=<string>                       - Split marker to use for merging the ouput
  --package-name=<string>                              - The name to use for the package declaration in the generated golang file.
  --pybind-header-include=<string>                     - The HEIR-generated header to include for the pybind11 bindings
  --pybind-module-name=<string>                        - The name of the generated python module (must match the .so file)
  --skip-vector-resizing                               - Skip resizing vectors to ringdimension/2 when emitting OpenFHE PKE code, i.e., assume the dimensions in the input IR are correct already.
  --split-input-file[=<string>]                          - Split the input file into chunks using the given or default marker and process each chunk independently
  --use-levels                                         - Use levels
  --verify-diagnostics                                 - Check that emitted diagnostics match expected-* lines on the corresponding line
  --verify-diagnostics=<value>                         - Check that emitted diagnostics match expected-* lines on the corresponding line
    =all                                               -   Check all diagnostics (expected, unexpected, near-misses)
    =<empty>                                           -   Check all diagnostics (expected, unexpected, near-misses)
    =only-expected                                     -   Check only expected diagnostics
  --weights-file=<string>                              - Emit all dense elements attributes to this binary file

Generic Options:

  --help                                               - Display available options (--help-hidden for more)
  --help-list                                          - Display list of available options (--help-list-hidden for more)
  --version                                            - Display the version of this program

I will run the tests and then squash the commits.

ai-mannamalai · 2025-09-18T17:56:19Z

@mabhay3420 - please post a successful build of the Docker image and a run of the tests in the repository; otherwise I don't see the value in taking this change on your word (well-meaning as it is).

mabhay3420 · 2025-09-18T18:00:01Z

> @mabhay3420 - please post a successful build of the Docker image and a run of the tests in the repository; otherwise I don't see the value in taking this change on your word (well-meaning as it is).

Agreed. Is this run as part of CI, or are you planning to test on your end?

ai-mannamalai · 2025-09-18T18:01:20Z

No bandwidth to test new docker image; what exists works for me.

mabhay3420 · 2025-09-18T18:03:15Z

And I just saw a test failure.

FAIL: //tests/Examples/jaxite:fully_connected (Exit 1) (see /home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/execroot/_main/bazel-out/k8-dbg/testlogs/tests/Examples/jaxite/fully_connected/test.log)
INFO: From Testing //tests/Examples/jaxite:fully_connected:
==================== Test output for //tests/Examples/jaxite:fully_connected:
Traceback (most recent call last):
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/_main/tests/Examples/jaxite/fully_connected_test.py", line 4, in <module>
    from tests.Examples.jaxite import fully_connected_lib
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/_main/tests/Examples/jaxite/fully_connected_lib.py", line 6, in <module>
    from jaxite.jaxite_bool import jaxite_bool
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/rules_python++pip+heir_pip_deps_311_jaxite/site-packages/jaxite/jaxite_bool/jaxite_bool.py", line 6, in <module>
    import jax
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/rules_python++pip+heir_pip_deps_311_jax/site-packages/jax/__init__.py", line 25, in <module>
    from jax._src.cloud_tpu_init import cloud_tpu_init as _cloud_tpu_init
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/rules_python++pip+heir_pip_deps_311_jax/site-packages/jax/_src/cloud_tpu_init.py", line 17, in <module>
    from jax._src import config
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/rules_python++pip+heir_pip_deps_311_jax/site-packages/jax/_src/config.py", line 26, in <module>
    from jax._src.lib import guard_lib
  File "/home/heiruser/.cache/bazel/_bazel_heiruser/a864f382268f12449bd3daa4685cd0c1/sandbox/processwrapper-sandbox/780/execroot/_main/bazel-out/k8-dbg/bin/tests/Examples/jaxite/fully_connected.runfiles/rules_python++pip+heir_pip_deps_311_jax/site-packages/jax/_src/lib/__init__.py", line 84, in <module>
    cpu_feature_guard.check_cpu_features()
RuntimeError: This version of jaxlib was built using AVX instructions, which your CPU and/or operating system do not support. You may be able work around this issue by building jaxlib from source.
================================================================================
[12,611 / 14,254] 19 / 575 tests, 1 failed; 10 actions running; last test: //tests/Examples/openfhe/bgv/binops:binops_test

Specifically:

RuntimeError: This version of jaxlib was built using AVX instructions, which your CPU and/or operating system do not support. You may be able work around this issue by building jaxlib from source.

Is this expected? I am on Rosetta-emulated amd64:

heiruser@b3721724568a:~/heir$ uname -a
Linux b3721724568a 6.15.11-orbstack-00539-g9885ebd8e3f4 #1 SMP PREEMPT Fri Aug 22 08:24:56 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
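
The jaxlib error above says the CPU or OS does not expose AVX. A minimal sketch for checking this from inside the container, assuming an x86_64 Linux guest where `/proc/cpuinfo` has a `flags` line; the helper name `has_cpu_flag` is hypothetical and not part of HEIR:

```shell
# Hypothetical helper: succeed if the CPU flags line lists the given flag.
# $1: flag name (e.g. avx), $2: cpuinfo file (defaults to /proc/cpuinfo)
has_cpu_flag() {
  # Take the first "flags" line and look for the flag as a whole word,
  # so that "avx2" or "avx512f" do not count as "avx".
  grep -m1 '^flags' "${2:-/proc/cpuinfo}" | grep -qw "$1"
}

if has_cpu_flag avx; then
  echo "AVX available"
else
  echo "AVX not available; prebuilt jaxlib wheels may refuse to import"
fi
```

If the flag is missing under emulation, the failure likely comes from the emulated CPU rather than from the Docker image itself.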

mabhay3420 · 2025-09-18T18:06:02Z

I am going to convert this to a draft and re-request review if I am able to resolve the test failure. Thank you for your time.

ai-mannamalai · 2025-09-18T18:07:00Z

Potentially OK to ignore on Docker.

j2kun · 2025-09-18T18:11:42Z

Feel free to add a GH workflow for testing the dockerfile. However, please don't make it run on every commit/PR. I consider this Dockerfile best-effort (GH actions is already running in docker, this is a dev tool). Feel free to have it run daily + on PRs that touch the docker directory.

jaxite

Agreed that for this dev-environment it's probably OK to ignore the incompatibility of some hardware backends. The jaxlib dependency is not part of HEIR so I can't ensure it works on every platform.
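
One way to skip the failing target on such hosts is Bazel's negative target patterns. This is a sketch only: the helper name `bazel_test_cmd` is hypothetical, and it only prints the command line rather than running it.

```shell
# Hypothetical helper: print a bazelisk test command that runs everything
# except the targets passed as arguments.
bazel_test_cmd() {
  # "--" ends option parsing so that "-//target" is read as a negative
  # target pattern rather than as a flag.
  cmd="bazelisk test -- //..."
  for target in "$@"; do
    cmd="$cmd -$target"
  done
  printf '%s\n' "$cmd"
}

# Print the command that excludes the jaxite test which failed above.
bazel_test_cmd //tests/Examples/jaxite:fully_connected
```

Running the printed command inside the container should exercise the rest of the test suite while leaving out the jaxlib-dependent test.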

mabhay3420 · 2025-09-18T18:36:41Z

@j2kun will need maintainer approval for running the new workflow.
P.S. I'll avoid tagging again so I don't break your flow state, but I may need to iterate on the newly added workflow a few times.

j2kun · 2025-09-18T18:38:45Z

Another thing you can do to iterate faster is open the same PR against your fork's main branch. Then the actions should run without needing my approval.

mabhay3420 · 2025-09-18T18:43:19Z

Thought about it already, but I have temporary payment issues with my GitHub account, so I can't run Actions 😢

mabhay3420 · 2025-09-18T20:19:52Z

The run was successful: https://github.com/google/heir/actions/runs/17837817443
I also squashed the commits.

mabhay3420 · 2025-09-19T13:48:07Z

@j2kun @ai-mannamalai ready for review.

j2kun

LGTM!

mabhay3420 changed the title ~~Fix issues with Docker setup( Issue https://github.com/google/heir/issues/2234)~~ Fix issues with Docker setup Sep 18, 2025

mabhay3420 force-pushed the docker_setup branch from 2e87b36 to 456689a Compare September 18, 2025 17:57

mabhay3420 marked this pull request as draft September 18, 2025 18:06

mabhay3420 marked this pull request as ready for review September 18, 2025 18:24

Fix libomp-dev and clang related issue in docker environment and improve dev experience (cd6b1f5)

1. Mount project directory for persistent repo.
2. Add helper script to automate building image and running container with right commands.
3. Improve documentation.
4. Add dockerfile test workflow.

mabhay3420 force-pushed the docker_setup branch from 30677b7 to cd6b1f5 Compare September 18, 2025 20:14

j2kun approved these changes Sep 19, 2025

j2kun added the pull_ready Indicates whether a PR is ready to pull. The copybara worker will import for internal testing label Sep 19, 2025

copybara-service bot merged commit 1a1e7c5 into google:main Sep 22, 2025
3 checks passed
