Development Practices and Coding Standards

Alexander Hein-Heifetz edited this page Apr 4, 2026 · 2 revisions

Development Practices & Standards


Tech Stack at a Glance

| Area | Technology |
| --- | --- |
| Language | Kotlin (preferred over Java) |
| Framework | Spring Boot / Spring Framework |
| Build Tool | Apache Maven |
| Database | Neo4j (graph database) |
| LLM Provider | OpenAI (GPT-4o / GPT-4o-mini); switchable via ModelProvider |
| Scripting | Python (client scripts, tooling) |
| Templating | Jinja2 (LLM prompt templates) |
| Container | Docker (Neo4j via Docker Compose) |
| API Docs | Swagger / OpenAPI |
| CI/CD | GitHub Actions + Dependabot |
| Quality | SonarQube / JaCoCo |
| IDE | IntelliJ IDEA |

General Development Practices

AI-Assisted Development

  • Use AI tools wherever possible: GitHub Copilot and Claude are the preferred tools.
  • Always closely review AI-generated code suggestions for correctness and IP concerns.

Technology Choices

  • Favor mainstream, well-supported technologies.
  • Do not introduce new technologies without clear justification.
  • Example: Neo4j is already used for the knowledge graph — no second database should be added without a strong reason.

Code Style

Kotlin Guidelines

  • Prefer Kotlin over Java for all new code.
  • Always use named arguments in Kotlin function calls for readability.
  • Use Kotlin's nullable types (T?) instead of Java's Optional.
  • Follow Spring naming conventions — consistency outweighs minimizing name length.
  • Names should be descriptive enough to make code self-explanatory.
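The named-argument and nullability guidelines above can be sketched as follows. This is a minimal illustration only: User, createUser, and findUser are hypothetical names that do not come from the project.

```kotlin
// Hypothetical types and functions for illustration; not part of the project.
data class User(val name: String, val email: String?)

fun createUser(name: String, email: String? = null, admin: Boolean = false): String {
    // The Elvis operator supplies a fallback where Java code would call Optional.orElse
    val contact = email ?: "no email"
    val role = if (admin) "admin" else "user"
    return "$name <$contact> [$role]"
}

// Nullable return type instead of Optional<User>
fun findUser(users: List<User>, name: String): User? =
    users.firstOrNull { it.name == name }

// Named arguments make the meaning of each value obvious at the call site
val summary = createUser(name = "Ada", admin = true)

// Safe calls replace Optional.map(...).orElse(...)
val emailLength =
    findUser(users = listOf(User(name = "Ada", email = null)), name = "Ada")?.email?.length ?: 0
```

Without named arguments, a call such as createUser("Ada", null, true) forces the reader to look up the signature to learn what null and true mean.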

Implementation class naming — think before reaching for Default, Simple, or Impl:

| Prefix/Suffix | When to use |
| --- | --- |
| SimpleX | Deliberately basic/minimal implementation; more sophisticated variants may follow |
| DefaultX | Standard out-of-the-box implementation; alternatives exist or may exist |
| XImpl | Sole implementation, not expecting multiples; often private (e.g. behind a companion object or from deserialization) |

Subclasses and implementations should contain the name of the supertype:

// Correct
class DefaultUserService : UserService

// Incorrect
class DefaultUsers : UserService

If none of Simple, Default, or Impl fit, prefer a descriptive name that reflects what the class actually does (e.g. CachingUserService).
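As a rough sketch of how these naming rules combine, the example below pairs a Default implementation with a descriptively named decorator. The findName method and the delegateCalls counter are assumptions made for illustration; only the class names follow the text above.

```kotlin
// Hypothetical service used only to illustrate the naming rules.
interface UserService {
    fun findName(id: Int): String?
}

// "Default": the standard implementation; alternatives exist or may exist.
class DefaultUserService(private val names: Map<Int, String>) : UserService {
    override fun findName(id: Int): String? = names[id]
}

// Descriptive name: neither Simple, Default, nor Impl says what this class does,
// so the name states the behavior and still contains the supertype name.
class CachingUserService(private val delegate: UserService) : UserService {
    private val cache = mutableMapOf<Int, String?>()

    // Exposed only so the caching behavior is observable in this sketch.
    var delegateCalls = 0
        private set

    override fun findName(id: Int): String? {
        if (id !in cache) {
            delegateCalls++
            cache[id] = delegate.findName(id)
        }
        return cache[id]
    }
}
```

Both class names end in UserService, so a reader scanning the codebase can tell which interface they implement without opening the file.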

Other conventions:

  • infoString is the preferred name for a method returning human-readable information about an object. Classes that provide it should implement the HasInfoString interface.

General Code Quality

  • Emphasize readability and maintainability over cleverness.
  • Comment anything non-obvious. Use descriptive names to reduce the need for trivial comments.
  • If the obvious approach does not work, always comment why — this saves future developers time.
  • Enclose strings that may contain whitespace in log messages in single quotes:
log.info("Processing entity '${entity.name}'")
  • Use @Schema and related annotations on all types exposed via REST for accurate Swagger/OpenAPI documentation.
  • Event types passed over WebSockets must have names ending in Event (the TypeScript generator requires this).

Neo4j / Cypher

  • Externalize all Cypher queries to src/main/resources/cypher — do not inline them in code.
  • Use Spring Data Neo4j 6 (SDN) with care: it has no second-level cache and deletes/reinserts entire subgraphs on save.
  • Treat SDN like an ORM only with full awareness of its performance characteristics.
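One way to keep Cypher out of Kotlin code is a small loader that resolves a query name to a classpath resource. The sketch below is an assumption about how this could look, not the project's actual mechanism: the CypherQueries class, the .cypher file extension, and the find_user query name are all hypothetical.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical helper: class name, file extension, and naming scheme are assumptions.
class CypherQueries(
    // Reads a classpath resource as text, or returns null if it does not exist.
    private val readResource: (String) -> String? = { path ->
        CypherQueries::class.java.classLoader
            .getResourceAsStream(path)
            ?.bufferedReader()
            ?.use { it.readText() }
    },
) {
    // Each query file is read at most once per name in the common case.
    private val cache = ConcurrentHashMap<String, String>()

    // Resolves "find_user" to the classpath resource "cypher/find_user.cypher".
    fun load(name: String): String =
        cache.getOrPut(name) {
            readResource("cypher/$name.cypher")?.trim()
                ?: throw IllegalArgumentException("No Cypher query found at 'cypher/$name.cypher'")
        }
}
```

Usage would be along the lines of CypherQueries().load(name = "find_user"). The reader function is injectable so that tests can supply query text without touching the classpath.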

Testing

Test Types

| Suffix | Description |
| --- | --- |
| *IntegrationTest | Spring integration test. Automated, runs under mvn test. Requires Docker (Neo4j). |
| *IT | Requires real infrastructure (e.g., a live LLM). Not automated; run manually for exploration. |

Test Conventions

  • Use @Nested JUnit Jupiter tests to group related test cases within a class.
  • Write test method names in natural language describing the scenario:
fun `should return true when the user is an admin`()
  • Use mockk for mocking — it is the Kotlin-idiomatic mocking library.
  • Integration tests mock the layer immediately below them (e.g., web controllers mock graph building).
  • Avoid code duplication in tests where possible via fixtures and utility functions, but do not be overly strict about it.
  • Never make real LLM API calls in tests. All LLM interactions must be mocked. If you find yourself reaching for a real ChatModel in a test, stop — mock it with mockk instead. Reasons:
    • Accessibility — this is an open source project. Community contributors should not need a paid API key just to run the test suite. Real LLM calls are a contribution barrier.
    • Cost — real calls cost money. With many contributors and frequent CI runs, this adds up quickly.
    • Non-determinism — LLM responses vary between calls, making assertions brittle and flaky tests hard to diagnose.
    • Latency — real calls can take several seconds each, making the full suite painfully slow.
    • Rate limits — heavy CI usage can hit API rate limits, causing random failures unrelated to code changes.
    • Offline development — contributors should be able to work and run tests without an internet connection.
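The conventions above (@Nested grouping, natural-language test names, and a mocked LLM) might combine as in the following sketch. It assumes mockk and JUnit Jupiter are on the test classpath. TextGenerator and EntitySummarizer are hypothetical stand-ins for whatever sits in front of the model; they are not project or Spring AI types.

```kotlin
import io.mockk.every
import io.mockk.mockk
import io.mockk.verify
import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.Nested
import org.junit.jupiter.api.Test

// Hypothetical seam in front of the LLM; not a project or Spring AI type.
interface TextGenerator {
    fun generate(prompt: String): String
}

// Hypothetical class under test.
class EntitySummarizer(private val generator: TextGenerator) {
    fun summarize(entityName: String): String =
        generator.generate(prompt = "Summarize '$entityName'").trim()
}

class EntitySummarizerTest {

    @Nested
    inner class Summarize {

        @Test
        fun `should return the trimmed LLM response`() {
            // The mock replaces the real model, so no API key or network access is needed
            val generator = mockk<TextGenerator>()
            every { generator.generate(prompt = any()) } returns "  A graph database.  "

            val summary = EntitySummarizer(generator = generator).summarize(entityName = "Neo4j")

            assertEquals("A graph database.", summary)
            // Verifies the prompt that would have been sent to the model
            verify(exactly = 1) { generator.generate(prompt = "Summarize 'Neo4j'") }
        }
    }
}
```

Because the response is fixed by the mock, the assertion is deterministic and the test runs offline in milliseconds.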

Dependencies

  • Favor mainstream choices for all libraries.
  • Prefer Spring or Spring-recommended libraries over third-party alternatives.
  • Use the latest GA version of all dependencies unless there is a specific reason not to.
  • Dependabot is enabled — keep an eye on automated dependency PRs.
  • During active development, Spring AI snapshots may be used; shift to GA as soon as available.

LLM Integration

ModelProvider Abstraction

Never use a Spring AI ChatModel or EmbeddingModel directly. Always go through the ModelProvider interface:

val model = modelProvider.getLlm("best")
  • LLMs are mapped to roles (e.g., best, cheapest) via application properties.
  • Role mapping is simpler and more predictable than resolving by quality or cost.
  • Model configuration lives in @Configuration classes under the config directory.
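To illustrate why role mapping is predictable, here is a minimal sketch of role-based resolution. It is not the project's ModelProvider, whose real interface may differ: RoleBasedModelNames is a hypothetical class, and the pairing of roles with model names is an assumption based on the models listed in the tech stack.

```kotlin
// Hypothetical sketch of role-based resolution; the project's real ModelProvider may differ.
class RoleBasedModelNames(private val modelsByRole: Map<String, String>) {

    // Fails fast and lists the known roles instead of silently falling back.
    fun modelFor(role: String): String =
        modelsByRole[role]
            ?: throw IllegalArgumentException(
                "No model configured for role '$role'. Known roles: ${modelsByRole.keys.sorted()}"
            )
}

// In the application these pairs would come from application properties (assumed values).
val modelNames = RoleBasedModelNames(
    modelsByRole = mapOf("best" to "gpt-4o", "cheapest" to "gpt-4o-mini"),
)
```

A role always resolves to the same configured model, so changing which model backs a role is a configuration change rather than a code change.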

Prompts

  • All LLM prompts are Jinja2 templates under src/main/resources/prompts.
  • Always escape potentially problematic user input with the esc filter:
{{ text|esc }}
  • Standard template variables: text (input text), formatInstructions (from Spring AI StructuredOutputConverter).
  • Experiment with prompts in the OpenAI Playground before embedding them in code.
  • Polish existing prompts by copying them from logs/prompts.log into the Playground UI.

Logging

  • Do not put logging configuration in application.properties — use logback-spring.xml.
  • General output goes to console. Focused logs go to the logs/ directory.

| File | Contents |
| --- | --- |
| logs/cypher.log | All Cypher queries executed |
| logs/prompts.log | Prompts sent to and responses from LLMs |
| logs/security.log | Security-related events |

What to Log

  • Keep logs at a consistent level of detail for a given log level. Extra detail belongs in DEBUG.
  • Don't spam logs. Remove debug log messages unless they have ongoing value.

Where to Log It

Use well-known named loggers where appropriate:

  • PROMPT_LOGGER — for exchanges with LLMs
  • CYPHER_LOGGER — for Neo4j queries
  • Otherwise, use a logger appropriate for the class.

Obtain a logger using the logger() method unless efficiency is a concern (e.g. inside a nested loop). If you want to avoid the stack examination cost, declare logger as a field on the class.

Always get the logger by .java class reference to avoid issues with Spring CGLIB proxies:

// Correct
private val logger = LoggerFactory.getLogger(CypherRagQueryExecutor::class.java)

// Incorrect — may break under Spring proxying
private val logger = LoggerFactory.getLogger(javaClass)

Only use javaClass if you are certain Spring will not proxy the object and inheritance may be involved (e.g. a protected logger field).
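The difference between the two lookups can be seen without Spring. In the sketch below, QueryExecutorProxy is a hand-written stand-in for the subclass a CGLIB proxy would generate at runtime; both class names are hypothetical.

```kotlin
open class QueryExecutor {
    // Resolved at runtime: yields the subclass when the instance is a subclass (or proxy)
    fun runtimeLoggerName(): String = javaClass.simpleName

    // Fixed at compile time: always the declaring class
    fun declaredLoggerName(): String = QueryExecutor::class.java.simpleName
}

// Stand-in for the subclass that a CGLIB proxy would generate at runtime
class QueryExecutorProxy : QueryExecutor()
```

For an instance of QueryExecutorProxy, runtimeLoggerName() returns "QueryExecutorProxy" while declaredLoggerName() returns "QueryExecutor". With a real proxy the runtime name is a generated one, so log output and logger configuration keyed on the class name would no longer match.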

How to Log It

Always use {} placeholders, never string interpolation. With placeholders the message is formatted only if the log level is enabled, so no string is built for messages that are never written:

// Correct
logger.info("The value is {}", value)

// Incorrect
logger.info("The value is $value")
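The cost difference can be demonstrated with a small stand-in. SketchLogger below is not SLF4J; it is a hypothetical, simplified logger that only mimics the relevant behavior (skip formatting when the level is disabled), and ExpensiveValue counts how often it is converted to text.

```kotlin
// Minimal stand-in for an SLF4J-style logger, only to show the difference in cost.
class SketchLogger(private val infoEnabled: Boolean) {
    val lines = mutableListOf<String>()

    fun info(format: String, vararg args: Any?) {
        if (!infoEnabled) return // arguments are never converted to strings
        var message = format
        for (arg in args) message = message.replaceFirst("{}", arg.toString())
        lines += message
    }
}

// Counts how often it is rendered to text.
class ExpensiveValue {
    var renderCount = 0
        private set

    override fun toString(): String {
        renderCount++
        return "expensive"
    }
}
```

With a logger created as SketchLogger(infoEnabled = false), calling info("The value is {}", value) leaves renderCount at 0, whereas info("The value is $value") calls toString() before the logger is even invoked.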

Making Logs Entertaining

Log messages should remain clear while making the world a more entertaining place — think funny airline safety videos.

Draw inspiration from: The Big Lebowski, Peep Show, Sherlock Holmes, Silicon Valley, The League of Gentlemen, and current affairs.

A few approved quotes ready for use:

  • "Yeah, well, you know, that's just like, uh, your opinion, man."
  • "What in god's holy name are you blathering about?"
  • "Sometimes you eat the bear, and sometimes, well, he eats you."
  • "That rug really tied the room together."
  • "This is a very complicated case, Maude. You know, a lotta ins, a lotta outs, a lotta what-have-yous."
  • "Is this your homework?"
  • "This is a local shop for local people."

Pull Request Conventions

  • PRs should do one thing. Keep scope focused.
  • Reference related issues from the issue tracker in the PR description.
  • All PRs must pass the GitHub Actions CI build (mvn test).
  • Review AI-generated code for correctness and IP concerns before merging.

API & Server Conventions

| Endpoint prefix | Access |
| --- | --- |
| api/v1/* | Programmatic. Requires API key (X-API-KEY header). |
| api/internal/* | UI access. Secured via OAuth. Not for remote clients. |
| /dev/* | Dev profile only. No API key required. For diagnostics and client development. |
  • Swagger / OpenAPI docs: http://localhost:8080/swagger-ui/index.html#/
  • WebSocket support uses the STOMP sub-protocol.
  • TypeScript interfaces are generated at target/typescript/embabel-rag.ts via mvn install.
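A programmatic client passes its key in the X-API-KEY header. The sketch below builds such a request with the JDK's java.net.http API. The apiRequest helper and the "documents" path used in the usage note are hypothetical; only the api/v1 prefix and the header name come from the table above.

```kotlin
import java.net.URI
import java.net.http.HttpRequest

// Builds (but does not send) a GET request for a programmatic endpoint.
// The helper itself is hypothetical; only the prefix and header name come from the conventions.
fun apiRequest(baseUrl: String, path: String, apiKey: String): HttpRequest =
    HttpRequest.newBuilder(URI.create("${baseUrl.trimEnd('/')}/api/v1/${path.trimStart('/')}"))
        .header("X-API-KEY", apiKey)
        .GET()
        .build()
```

A call such as apiRequest(baseUrl = "http://localhost:8080", path = "documents", apiKey = key) yields a request that can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).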

Code Coverage & Quality

  • Code coverage is computed with JaCoCo.
  • View the local report at target/site/jacoco/index.html after running tests.
  • SonarQube reports are available on the project dashboard.
  • Quality gate must pass on SonarCloud before merging.
