[BUG]: Compiling PTX with undefined extern succeeds and produces empty kernel #920

@gmarkall

Is this a duplicate?

Type of Bug

Silent Failure

Component

cuda.core

Describe the bug

Compiling PTX that references an undefined extern function appears to succeed and produces a cubin with an empty kernel. I am compiling it like this:

from cuda.core.experimental import Program, ProgramOptions

# `code` holds the PTX source as a string (the same PTX passed to ptxas below)
options = ProgramOptions(arch="sm_75")
program = Program(code, code_type="ptx", options=options)
cubin = program.compile("cubin")

The same PTX compiled with ptxas results in a failure to compile:

$ ptxas -arch sm_75 add_float16.ptx
ptxas fatal   : Unresolved extern function '_ZplRK6__halfS1__1'

Am I using the Program.compile() interface incorrectly?

How to Reproduce

Run the attached file:

python cudapy.py

This produces a cubin with an empty kernel and shows that ptxas raises an error for the same PTX.

cudapy.py
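The attached script is not reproduced here. As a rough illustration of the ptxas half of such a cross-check, the sketch below runs the same `ptxas -arch sm_75 <file>` command from Python and reports whether it succeeded. The helper names `ptxas_command` and `check_with_ptxas` are hypothetical and are not part of cuda.core or of the attached file. The sketch assumes `ptxas` is on `PATH` and writes the cubin to a temporary directory via `-o`.

```python
import os
import shutil
import subprocess
import tempfile


def ptxas_command(ptx_path, arch, out_path, ptxas="ptxas"):
    """Build the ptxas command line (hypothetical helper, mirrors the CLI call above)."""
    return [ptxas, "-arch", arch, ptx_path, "-o", out_path]


def check_with_ptxas(ptx_path, arch="sm_75", ptxas="ptxas"):
    """Compile a PTX file with ptxas and return (ok, message).

    ok is True if ptxas exits with status 0, False if it reports an error,
    and None if the ptxas executable cannot be found on PATH.
    """
    exe = shutil.which(ptxas)
    if exe is None:
        return None, f"{ptxas} not found on PATH"

    # Write the cubin into a throwaway directory; only the exit status
    # and diagnostics matter for this check.
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "out.cubin")
        result = subprocess.run(
            ptxas_command(ptx_path, arch, out_path, ptxas=exe),
            capture_output=True,
            text=True,
        )

    # Return whichever stream carries the diagnostics.
    message = (result.stderr or result.stdout).strip()
    return result.returncode == 0, message
```

Calling `check_with_ptxas("add_float16.ptx")` before `Program.compile()` would, under these assumptions, surface the `Unresolved extern function` diagnostic as `(False, message)` instead of silently continuing with an empty kernel.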

Expected behavior

An exception or some other error raised by Program.compile()

Operating System

Ubuntu Linux

nvidia-smi output

No response
