Development Practices and Coding Standards
| Area | Technology |
|---|---|
| Language | Kotlin (preferred over Java) |
| Framework | Spring Boot / Spring Framework |
| Build Tool | Apache Maven |
| Database | Neo4j (graph database) |
| LLM Provider | OpenAI (GPT-4o / GPT-4o-mini), switchable via `ModelProvider` |
| Scripting | Python (client scripts, tooling) |
| Templating | Jinja2 (LLM prompt templates) |
| Container | Docker (Neo4j via Docker Compose) |
| API Docs | Swagger / OpenAPI |
| CI/CD | GitHub Actions + Dependabot |
| Quality | SonarQube / Jacoco |
| IDE | IntelliJ IDEA |
- Use AI tools wherever possible: GitHub Copilot and Claude are the preferred tools.
- Always closely review AI-generated code suggestions for correctness and IP concerns.
- Favor mainstream, well-supported technologies.
- Do not introduce new technologies without clear justification.
- Example: Neo4j is already used for the knowledge graph — no second database should be added without a strong reason.
- Prefer Kotlin over Java for all new code.
- Always use named parameters in Kotlin function calls for readability.
- Use Kotlin's `?` nullability instead of Java `Optional`.
- Follow Spring naming conventions: consistency outweighs minimizing name length.
- Names should be descriptive enough to make code self-explanatory.
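The two Kotlin rules above (named parameters and `?` nullability) might look like this in practice. This is a minimal sketch: `User`, `findUser`, and `contactLine` are hypothetical names used only for illustration.

```kotlin
// Hypothetical types for illustration; not taken from the project.
data class User(val name: String, val email: String?)

// Returns null rather than java.util.Optional when no user matches.
fun findUser(users: List<User>, name: String): User? =
    users.firstOrNull { it.name == name }

// Safe-call and Elvis operators take the place of Optional.map(...).orElse(...).
// Named parameters keep the call site readable.
fun contactLine(users: List<User>, name: String, fallback: String): String =
    findUser(users = users, name = name)?.email ?: fallback
```

A call such as `contactLine(users = users, name = "ada", fallback = "n/a")` returns the fallback both when the user is missing and when the user has no email.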
Implementation class naming — think before reaching for Default, Simple, or Impl:
| Prefix/Suffix | When to use |
|---|---|
| `SimpleX` | Deliberately basic/minimal implementation; more sophisticated variants may follow |
| `DefaultX` | Standard out-of-the-box implementation; alternatives exist or may exist |
| `XImpl` | Sole implementation, not expecting multiples; often private (e.g. behind a companion object or from deserialization) |
Subclasses and implementations should contain the name of the supertype:
```kotlin
// Correct
class DefaultUserService : UserService

// Incorrect
class DefaultUsers : UserService
```

If none of `Simple`, `Default`, or `Impl` fit, prefer a descriptive name that reflects what the class actually does (e.g. `CachingUserService`).
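As a rough sketch of how these naming rules could play out, the following pairs a `Default` implementation with a descriptively named decorator. The `UserService` interface, its `displayName` method, and the `delegateCalls` counter are assumptions made for the example, not the project's actual API.

```kotlin
// UserService and both implementations are hypothetical, for illustration only.
interface UserService {
    fun displayName(userId: String): String
}

// "Default": the standard implementation; alternatives exist or may exist.
class DefaultUserService(private val names: Map<String, String>) : UserService {
    override fun displayName(userId: String): String = names[userId] ?: "unknown"
}

// Descriptive name: says what the class does and still contains the supertype name.
class CachingUserService(private val delegate: UserService) : UserService {
    private val cache = mutableMapOf<String, String>()

    // Counts how often the delegate was actually consulted.
    var delegateCalls = 0
        private set

    override fun displayName(userId: String): String =
        cache.getOrPut(userId) {
            delegateCalls++
            delegate.displayName(userId = userId)
        }
}
```

Neither `Simple` nor `Impl` would describe the second class well, which is why it is named after its behavior.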
Other conventions:
- `infoString` is the preferred name for a method returning human-readable information about an object. Implement the `HasInfoString` interface.
- Emphasize readability and maintainability over cleverness.
- Comment anything non-obvious. Use descriptive names to reduce the need for trivial comments.
- If the obvious approach does not work, always comment why — this saves future developers time.
- Enclose strings that may contain whitespace in log messages in single quotes:

  ```kotlin
  log.info("Processing entity '{}'", entity.name)
  ```

- Use `@Schema` and related annotations on all types exposed via REST for accurate Swagger/OpenAPI documentation.
- Event types passed over WebSockets must end with `Event` (required for the TypeScript generator).
- Externalize all Cypher queries to `src/main/resources/cypher`; do not inline them in code.
- Use Spring Data Neo4j 6 with care: it has no second-level cache and deletes/reinserts entire subgraphs on save.
- Treat SDN like an ORM only with full awareness of its performance characteristics.
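One way to keep Cypher out of the code is a small loader that reads named queries from the classpath. The sketch below is an assumption about how that could be done, not the project's actual mechanism: the `CypherQueries` class, the `.cypher` file extension, and the injectable `readResource` function are all hypothetical.

```kotlin
// Hypothetical helper, not the project's actual loader.
// The ".cypher" extension is an assumption.
class CypherQueries(
    private val basePath: String = "/cypher",
    // Reads a classpath resource as text, or returns null if it is missing.
    private val readResource: (String) -> String? = { path ->
        CypherQueries::class.java.getResourceAsStream(path)
            ?.bufferedReader()
            ?.use { it.readText() }
    },
) {
    // Looks up a named query and fails loudly when the file does not exist.
    fun load(name: String): String {
        val path = "$basePath/$name.cypher"
        return readResource(path)?.trim()
            ?: throw IllegalArgumentException("No Cypher query found at '$path'")
    }
}
```

Making `readResource` a constructor parameter lets a unit test supply query text directly, without touching the file system or Neo4j.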
| Suffix | Description |
|---|---|
| `*IntegrationTest` | Spring integration test. Automated, runs under `mvn test`. Requires Docker (Neo4j). |
| `*IT` | Requires real infrastructure (e.g., a live LLM). Not automated; run manually for exploration. |
- Use `@Nested` JUnit Jupiter tests to group related test cases within a class.
- Write test method names in natural language describing the scenario:

  ```kotlin
  fun `should return true when the user is an admin`()
  ```

- Use mockk for mocking; it is the Kotlin-idiomatic mocking library.
- Integration tests mock the layer immediately below them (e.g., web controllers mock graph building).
- Avoid code duplication in tests where possible via fixtures and utility functions, but do not be overly strict about it.
- Never make real LLM API calls in tests. All LLM interactions must be mocked. If you find yourself reaching for a real `ChatModel` in a test, stop and mock it with mockk instead. Reasons:
  - Accessibility: this is an open source project. Community contributors should not need a paid API key just to run the test suite. Real LLM calls are a contribution barrier.
  - Cost: real calls cost money. With many contributors and frequent CI runs, this adds up quickly.
  - Non-determinism: LLM responses vary between calls, making assertions brittle and flaky tests hard to diagnose.
  - Latency: real calls can take several seconds each, making the full suite painfully slow.
  - Rate limits: heavy CI usage can hit API rate limits, causing random failures unrelated to code changes.
  - Offline development: contributors should be able to work and run tests without an internet connection.
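The rule against real LLM calls can be followed with a mockk stub along these lines. This is only a sketch and assumes mockk is on the test classpath; the `Llm` interface and `EntitySummarizer` class are hypothetical stand-ins, and the project's real types will differ.

```kotlin
import io.mockk.every
import io.mockk.mockk
import io.mockk.verify

// Hypothetical stand-in for the real LLM abstraction; the project's types will differ.
interface Llm {
    fun complete(prompt: String): String
}

// Hypothetical class under test that depends on an LLM.
class EntitySummarizer(private val llm: Llm) {
    fun summarize(text: String): String =
        llm.complete(prompt = "Summarize: $text").trim()
}

// Exercises EntitySummarizer against a mocked LLM: no network, no API key.
fun summarizeWithMockedLlm(): String {
    val llm = mockk<Llm>()
    // Any prompt yields the same canned response, so assertions stay deterministic.
    every { llm.complete(prompt = any()) } returns "  A short summary.  "

    val summary = EntitySummarizer(llm = llm).summarize(text = "A long document")

    // Confirms the collaborator was called exactly once with the expected prompt.
    verify(exactly = 1) { llm.complete(prompt = "Summarize: A long document") }
    return summary
}
```

In a real test class the body of `summarizeWithMockedLlm` would sit inside a JUnit test method, with the returned summary checked by an assertion.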
- Favor mainstream choices for all libraries.
- Prefer Spring or Spring-recommended libraries over third-party alternatives.
- Use the latest GA version of all dependencies unless there is a specific reason not to.
- Dependabot is enabled — keep an eye on automated dependency PRs.
- During active development, Spring AI snapshots may be used; shift to GA as soon as available.
Never use a Spring AI `ChatModel` or `EmbeddingModel` directly. Always go through the `ModelProvider` interface:

```kotlin
val model = modelProvider.getLlm("best")
```

- LLMs are mapped to roles (e.g., `best`, `cheapest`) via application properties.
- Role mapping is simpler and more predictable than resolving by quality or cost.
- Model configuration lives in `@Configuration` classes under the `config` directory.
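To illustrate why role mapping is predictable, here is a minimal sketch of a role-to-model lookup. Only the `getLlm(role)` call shape comes from the guideline above; the `Llm` type, the simplified `ModelProvider` interface, the `PropertiesModelProvider` class, and the role-to-model pairs shown in the usage note are assumptions for the example.

```kotlin
// Hypothetical stand-ins: the real ModelProvider and model types live in the project
// and will differ. Only the getLlm(role) call shape comes from the guideline.
data class Llm(val name: String)

interface ModelProvider {
    fun getLlm(role: String): Llm
}

// Resolves roles from a plain map, such as one bound from application properties.
class PropertiesModelProvider(private val roleToModel: Map<String, String>) : ModelProvider {
    override fun getLlm(role: String): Llm {
        // An unknown role is a configuration error, so fail fast with a clear message.
        val modelName = roleToModel[role]
            ?: throw IllegalArgumentException("No model configured for role '$role'")
        return Llm(name = modelName)
    }
}
```

With a map such as `mapOf("best" to "gpt-4o", "cheapest" to "gpt-4o-mini")`, swapping the model behind a role becomes a configuration change and callers stay untouched.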
- All LLM prompts are Jinja2 templates under `src/main/resources/prompts`.
- Always escape potentially problematic user input with the `esc` filter:

  ```jinja
  {{ text|esc }}
  ```

- Standard template variables: `text` (input text), `formatInstructions` (from Spring AI `StructuredOutputConverter`).
- Experiment with prompts in the OpenAI Playground before embedding them in code.
- Polish existing prompts by copying them from `logs/prompts.log` into the Playground UI.
- Do not put logging configuration in `application.properties`; use `logback-spring.xml`.
- General output goes to console. Focused logs go to the `logs/` directory.
| File | Contents |
|---|---|
| `logs/cypher.log` | All Cypher queries executed |
| `logs/prompts.log` | Prompts sent to and responses from LLMs |
| `logs/security.log` | Security-related events |
- Keep logs at a consistent level of detail for a given log level. Extra detail belongs in `DEBUG`.
- Don't spam logs. Remove debug log messages unless they have ongoing value.
Use well-known named loggers where appropriate:
- `PROMPT_LOGGER`: for exchanges with LLMs
- `CYPHER_LOGGER`: for Neo4j queries
- Otherwise, use a logger appropriate for the class.
Obtain a logger using the logger() method unless efficiency is a concern (e.g. inside a nested loop).
If you want to avoid the stack examination cost, declare logger as a field on the class.
Always get the logger by .java class reference to avoid issues with Spring CGLIB proxies:
```kotlin
// Correct
private val logger = LoggerFactory.getLogger(CypherRagQueryExecutor::class.java)

// Incorrect: may break under Spring proxying
private val logger = LoggerFactory.getLogger(javaClass)
```

Only use `javaClass` if you are certain Spring will not proxy the object and inheritance may be involved (e.g. a protected logger field).
Always use {} placeholders, never string interpolation. This is more efficient and enables lazy evaluation:
```kotlin
// Correct
logger.info("The value is {}", value)

// Incorrect
logger.info("The value is $value")
```

Log messages should remain clear while making the world a more entertaining place: think funny airline safety videos.
Draw inspiration from: The Big Lebowski, Peep Show, Sherlock Holmes, Silicon Valley, The League of Gentlemen, and current affairs.
A few approved quotes ready for use:
- "Yeah, well, you know, that's just like, uh, your opinion, man."
- "What in god's holy name are you blathering about?"
- "Sometimes you eat the bear, and sometimes, well, he eats you."
- "That rug really tied the room together."
- "This is a very complicated case, Maude. You know, a lotta ins, a lotta outs, a lotta what-have-yous."
- "Is this your homework?"
- "This is a local shop for local people."
- PRs should do one thing. Keep scope focused.
- Reference related issues from the issue tracker in the PR description.
- All PRs must pass the GitHub Actions CI build (`mvn test`).
- Review gen AI code suggestions for correctness and IP before merging.
| Endpoint prefix | Access |
|---|---|
| `api/v1/*` | Programmatic. Requires API key (`X-API-KEY` header). |
| `api/internal/*` | UI access. Secured via OAuth. Not for remote clients. |
| `/dev/*` | Dev profile only. No API key required. For diagnostics and client development. |
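A programmatic client would attach the API key as a header on every `api/v1` call. The sketch below builds such a request with the JDK's `java.net.http` API without sending it. The `apiRequest` helper is hypothetical, and any concrete path passed to it is a placeholder, since the actual endpoints are not listed here.

```kotlin
import java.net.URI
import java.net.http.HttpRequest

// Builds (but does not send) a GET request to the programmatic API.
// The helper itself and any path passed in are hypothetical placeholders.
fun apiRequest(baseUrl: String, path: String, apiKey: String): HttpRequest =
    HttpRequest.newBuilder()
        .uri(URI.create("$baseUrl/api/v1/$path"))
        // The key travels in the X-API-KEY header named in the table above.
        .header("X-API-KEY", apiKey)
        .GET()
        .build()
```

The resulting request can be passed to `java.net.http.HttpClient.send` once a real endpoint path and key are known.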
- Swagger / OpenAPI docs: http://localhost:8080/swagger-ui/index.html#/
- WebSocket support uses the STOMP sub-protocol.
- TypeScript interfaces are generated at `target/typescript/embabel-rag.ts` via `mvn install`.
- Code coverage is computed with Jacoco.
- View local report at `target/site/jacoco/index.html` after running tests.
- SonarQube reports are available on the project dashboard.
- Quality gate must pass on SonarCloud before merging.
(c) Embabel Software Inc 2024-2025.