Performant double-precision GPUs (such as the V100 or A100) are very expensive and not available to most scientific communities. In my opinion, we should support Float32 GPU optimization.
Currently, there is a `convert_data` function to prepare data for different backends, but it has no option to prepare data for 32-bit optimization (even though `ExaCore()` supports it). Running the AC OPF example from the documentation with `T = Float32` and MadNLP with `CuCholeskySolver` results in a restoration failure in the second iteration. A minimal sketch of the failing path is shown below.
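For concreteness, here is a minimal sketch of the Float32 GPU path. It substitutes the small Luksan-Vlcek problem from the ExaModels quickstart for the full AC OPF model (too long to reproduce here), and it assumes `CuCholeskySolver` is passed via MadNLP's `linear_solver` option; the exact option spelling may differ:

```julia
using ExaModels, MadNLP, MadNLPGPU, CUDA

# Single-precision ExaCore on the CUDA backend -- this part is already
# supported by ExaModels.
c = ExaCore(Float32; backend = CUDABackend())

# Small test problem (the Luksan-Vlcek example from the ExaModels
# quickstart), standing in for the AC OPF model referenced above.
x = variable(c, 10; start = (mod(i, 2) == 1 ? -1.2 : 1.0 for i = 1:10))
objective(c, 100 * (x[i-1]^2 - x[i])^2 + (x[i-1] - 1)^2 for i = 2:10)
constraint(
    c,
    3x[i+1]^3 + 2x[i+2] - 5 + sin(x[i+1] - x[i+2]) * sin(x[i+1] + x[i+2]) +
    4x[i+1] - x[i] * exp(x[i] - x[i+1]) - 3 for i = 1:6
)
m = ExaModel(c)

# On the full AC OPF example, this call is where the restoration failure
# appears at the second iteration when T = Float32.
result = madnlp(m; linear_solver = CuCholeskySolver)
```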
Additionally, `convert_data` is not a documented function. If possible, it should be documented and its usage explained in the provided example code. A sketch of what an element-type-aware version could look like follows.
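As a rough sketch of the requested option, the method below is hypothetical: it assumes the existing `convert_data(data, backend)` accepts a NamedTuple of arrays (as in the AC OPF example), and it only handles flat floating-point arrays, while the nested NamedTuple element types in the actual power-flow data would need a recursive conversion:

```julia
using CUDA  # for CUDABackend in the usage note below

# Hypothetical element-type-aware method on top of the existing
# two-argument convert_data(data, backend). The leading type argument
# selects the floating-point precision; the backend argument is
# forwarded unchanged.
function convert_data(::Type{T}, data::NamedTuple, backend) where {T<:AbstractFloat}
    # Convert every floating-point array field to T; leave everything
    # else (index vectors, nested structures) untouched. A complete
    # version would also recurse into arrays of NamedTuples.
    converted = map(data) do field
        field isa AbstractArray{<:AbstractFloat} ? T.(field) : field
    end
    return convert_data(converted, backend)  # existing backend conversion
end

# Hypothetical usage in the AC OPF example:
# data = convert_data(Float32, parse_ac_power_data(filename), CUDABackend())
```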