Fix AI-generated code that breaks on your internal APIs

TL;DR

AI-generated code still breaks on internal APIs — even when the model sees the full API documentation.

PriCoder makes your API calls actually work.

→ Correct API calls
→ Fewer runtime errors
→ Code that actually runs

Try it in your IDE

See broken API calls fixed in your own codebase.

1. Install the Enia Code plugin.
2. Open your project.
3. Ask the AI to modify or generate code that uses your internal APIs.

Where it breaks

Looks correct. Still wrong.

The model sees the API — but fails to follow required usage patterns:

❌ LLM output (with API access)

import ndonnx

def safe_ratio(x, y):
    result = ndonnx.divide(x, y)              # missing asarray
    return ndonnx.where(y == 0, 0.0, result)      # incorrect API usage

✅ correct usage

import ndonnx

def safe_ratio(x, y, fallback=0.0):
    x = ndonnx.asarray(x)
    y = ndonnx.asarray(y)

    quotient = ndonnx.divide(x, y)
    return ndonnx.where(y == 0, fallback, quotient)

In this example the model had access to the full API specification and still skipped the `asarray` conversion on its inputs.

What PriCoder does

PriCoder trains models on how APIs are actually used in real code.

- learns API invocation patterns, not just documentation
- improves multi-step API usage
- reduces incorrect or missing API calls in real code

Why it works

Most approaches (RAG):

- retrieve API documentation
- rely on the model to use it correctly
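As a rough illustration of the retrieval-only approach, here is a minimal sketch of a keyword-overlap retriever that pastes the best-matching documentation entries into a prompt. The functions `tokenize`, `retrieve`, and `build_prompt` and the one-line descriptions are hypothetical, written for this example; they are not part of PriCoder or of any real RAG system.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(task: str, api_docs: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank API names by how many tokens their entry shares with the task."""
    task_tokens = tokenize(task)
    ranked = sorted(
        api_docs,
        key=lambda name: len(task_tokens & tokenize(name + " " + api_docs[name])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(task: str, api_docs: dict[str, str], top_k: int = 2) -> str:
    """Prepend the retrieved entries to the task; the model does the rest."""
    names = retrieve(task, api_docs, top_k)
    context = "\n".join(f"{name}: {api_docs[name]}" for name in names)
    return f"API documentation:\n{context}\n\nTask: {task}\n"


# Toy descriptions written for illustration only (not real library docs).
TOY_DOCS = {
    "asarray": "convert input to an array",
    "divide": "divide two arrays elementwise",
    "where": "select values by condition",
}
TASK = "divide two arrays and return zero where the divisor is zero"
```

In this toy run, `retrieve(TASK, TOY_DOCS)` returns the `divide` and `where` entries, and the `asarray` entry never reaches the prompt. Retrieval changes only what the model reads, not how it writes the calls.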

PriCoder:

- trains on how APIs are actually invoked
- learns usage patterns from synthesized, execution-validated data
- improves API usage, not just API awareness
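One way to read "execution-validated" is that a synthesized sample is kept only if it actually runs. The sketch below assumes that reading: it writes each candidate to a temporary file, runs it in a separate Python process, and keeps the ones that exit cleanly. The helpers `runs_cleanly` and `keep_validated` are hypothetical names for illustration, not the filtering code in this repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def runs_cleanly(snippet: str, timeout_s: float = 10.0) -> bool:
    """Return True if the snippet exits with status 0 within the timeout."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(snippet, encoding="utf-8")
        try:
            completed = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,  # keep candidate output off the console
                timeout=timeout_s,    # the child is killed if it runs too long
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
        return completed.returncode == 0


def keep_validated(candidates: list[str]) -> list[str]:
    """Filter synthesized samples down to those that execute without error."""
    return [code for code in candidates if runs_cleanly(code)]
```

A sample that calls a function that does not exist raises at runtime, exits non-zero, and is dropped. A sample whose own assertions pass is kept. Running untrusted generated code this way needs a proper sandbox in practice; the subprocess here only isolates crashes and timeouts.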

Results

Even with full API documentation, models still fail on API-heavy code generation.

Baseline performance is low: giving the model oracle API specifications raises pass@1 only from ~8% to ~13%.

PriCoder significantly improves this:

- more than 20 percentage points of absolute pass@1 improvement
- higher success rates in multi-call workflows
- fewer runtime execution failures

These gains come with negligible impact on general code generation (see the paper for full evaluation details).
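This README does not say how pass@1 is computed. A common choice in code-generation evaluations is the unbiased pass@k estimator, which for k = 1 reduces to the fraction of sampled solutions that pass the tests. The sketch below assumes that estimator; `pass_at_k` and `mean_pass_at_k` are illustrative names, not functions from `infer_and_eval`.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n samples, c of which passed."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Too few failing samples to fill a draw of k, so every draw passes.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimate over (n, c) pairs, one pair per task."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For example, a task with 10 samples and 3 passing gives a pass@1 of 0.3, and the benchmark score is the mean of that value over all tasks.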

Paper

To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

Notes

PriCoder focuses on a core gap in current AI coding systems:

Models can see APIs, but don’t know how to use them.

This project addresses that gap.

Feedback and discussions are welcome.

Local Pipeline

Run the main PriCoder pipeline one stage at a time, in this directory order:

```bash
cd data_generation
cd ../api_extract
cd ../benchmark_generation
cd ../infer_and_eval
```

Optional steps and resources:

- pypi_crawling/: collect and filter package/API sources
- data/: reusable datasets (documents and benchmarks)
