IA integration

Short description of the challenge

A proof score is a structured, interactive proof script written directly in the same language that is used to implement the corresponding formal specification. Proof scores have been extensively studied in the CafeOBJ and Maude languages (more in general, in the OBJ3 family of languages). We describe here some issues that we have identified in proof scores; these issues can be generalized to other theorem proving approaches.

Research team/s involved (or communities)

FADoSS - UCM - COMFIA project

Main issues / open problems related to the challenge

The main issues detected when dealing with proof scores are:

Theorem proving is difficult, hence engineers cannot deal with it and professionals are required. As a consequence, we have the next issue.
Theorem proving requires experts.
There exists a gap between specifications and implementations. This gap may affect proofs because only the specification is really verified.
Not even theorem proving provides complete assurance. This issue is related to the previous one, but we also consider here that some verification techniques, as model checking, are only applied to particular instances of the problem.
Theorem proving consumes (too much) resources. In practice, real-world software projects often operate under strict time and budget constraints, making it difficult to justify allocating developers and resources to formal methods, which are sometimes perceived as an unnecessary overhead.
Theorem proving is only useful for critical applications. As a consequence, it has been relegated to very specific scenarios.
Theorem proving does not work with real problems, so several key features are necessary: (i) the specification language must be expressive enough to represent even subtle cases in a clear and concise manner; (ii) the prover must be sufficiently powerful, offering a reasonable degree of automation and a broad set of commands to handle diverse situations; and (iii) ideally, the overall system should be user-friendly and seamlessly integrate into the development workflow.

Techniques and technologies related to the challenge

Proof scores, implemented in either CafeOBJ or Maude, are directly related to the issues above. However, we consider that the problems above and the improvements below can be applied to any interactive theorem prover.

We have identified the following enhancements:

Automatizing lemma discovery and proofs. Sledgehammer for Isabelle/HOL and CiMPG+ for proof scores are tools in this direction.
IDE and Graphical User Interface support. As modern programming languages have their own IDE, theorem provers will also benefit from this kind of technology.
Application to New Protocols. As technology evolves, theorem proving must also prove their usefulness dealing with new protocols.
Artificial intelligence support, possibly refining generative models to help users with proofs.

Example(s)

Examples are available here.

Other relevant information

This information has been exhaustively detailed in the following jornal paper

Adrián Riesco, Kazuhiro Ogata, Masaki Nakamura, Daniel Gaina, Duong Dinh Tran, Kokichi Futatsugi: Proof Scores: A Survey. ACM Comput. Surv. 57(10): 251:1-251:37 (2025)

An extended version of the paper is also available in ArXiv.

list of tools

list of challenges

IA integration

Short description of the challenge

Research team/s involved (or communities)

Main issues / open problems related to the challenge

Techniques and technologies related to the challenge

Example(s)

Other relevant information

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally