Merged
72 changes: 67 additions & 5 deletions jacodb-analysis/README.md
@@ -1,11 +1,73 @@
# Module jacodb-analysis

Module for custom analysis
The analysis module lets you run dataflow analyses of applications.
It provides an API for writing custom analyses, along with several ready-to-use analyses.

## IFDS
## Concept of units

TODO
The [IFDS](https://dx.doi.org/10.1145/199448.199462) framework is used as the basis for this module.
However, in order to be scalable, the analyzed code is split into so-called units, so that the framework
can analyze them concurrently.
Information is shared between the units via summaries, but the lifecycle of each unit is controlled
separately.
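
To make the idea of units concrete, here is a minimal sketch of how a resolver could group methods into units. It does not use the module's types: `MethodRef`, `ToyUnitResolver`, `toyMethodResolver`, `toyClassResolver` and `splitIntoUnits` are hypothetical names introduced only for illustration, and the real `UnitResolver` may look different.

```kotlin
// Hypothetical stand-in for a method reference; not a jacodb type.
data class MethodRef(val className: String, val name: String)

// Hypothetical resolver: maps every method to the unit it belongs to.
fun interface ToyUnitResolver<U> {
    fun resolve(method: MethodRef): U
}

// Smallest units: every method is its own unit.
val toyMethodResolver = ToyUnitResolver<MethodRef> { it }

// Larger units: all methods of one class share a unit.
val toyClassResolver = ToyUnitResolver<String> { it.className }

// Groups methods by the unit the resolver assigns to them.
fun <U> splitIntoUnits(
    methods: List<MethodRef>,
    resolver: ToyUnitResolver<U>
): Map<U, List<MethodRef>> = methods.groupBy { resolver.resolve(it) }

fun main() {
    val methods = listOf(
        MethodRef("A", "foo"),
        MethodRef("A", "bar"),
        MethodRef("B", "baz")
    )
    println(splitIntoUnits(methods, toyMethodResolver).size) // 3
    println(splitIntoUnits(methods, toyClassResolver).size)  // 2
}
```

In this sketch, the choice of resolver is the only thing that changes the granularity: the same three methods form three units or two, depending on the grouping key.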

## Points To
## Get started

TODO
The entry point of the analysis is the [runAnalysis] method. In order to call it, you have to provide:
* `graph` — an application graph that is used for analysis. To obtain this graph, one should call the [newApplicationGraphForAnalysis] method.
* `unitResolver` — an object that groups methods into units. Choose one from `UnitResolversLibrary`.
Note that in general, larger units mean more precise but also more resource-consuming analysis.
* `ifdsUnitRunner` — an [IfdsUnitRunner] instance, which is used to analyze each unit. This is what defines the concrete analysis.
Ready-to-use runners are located in `RunnersLibrary`.
* `methods` — a list of methods to analyze.

For example, to detect unused variables in the given `analyzedClass` methods, you may run the following code
(assuming that `classpath` is an instance of [JcClasspath]):

```kotlin
val applicationGraph = runBlocking {
    classpath.newApplicationGraphForAnalysis()
}

val methodsToAnalyze = analyzedClass.declaredMethods
val unitResolver = MethodUnitResolver
val runner = UnusedVariableRunner

runAnalysis(applicationGraph, unitResolver, runner, methodsToAnalyze)
```

## Implemented runners

Currently, the following runners are implemented:
* `UnusedVariableRunner` that can detect issues like unused variable declaration, unused return value, etc.
* `NpeRunner` that can find instructions with possible null-value dereference.
* Generic `TaintRunner` that can perform taint analysis.
* `SqlInjectionRunner` that finds places vulnerable to SQL injection, thus performing a specific kind of taint analysis.

## Implementing your own analysis

To implement a simple one-pass analysis, use [IfdsBaseUnitRunner].
To instantiate it, you need an [AnalyzerFactory] instance, which is an object that creates an [Analyzer] from a
[JcApplicationGraph].

To implement the [Analyzer] interface, you have to specify the following:

* `flowFunctions`, which describe the dataflow facts and how they are propagated during the analysis.

* How vulnerabilities are produced from these facts, i.e. you have to implement the `getSummaryFacts` and `getSummaryFactsPostIfds` methods.
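
The two items above can be illustrated with a small, self-contained sketch. It only mirrors the shapes of `DomainFact` and `FlowFunctionInstance` with local stand-ins and does not import anything from the module. `ZeroFact`, `TaintedVar`, `assignFlow` and `propagate` are hypothetical names, and the driver is a toy pass over straight-line code, not the IFDS tabulation algorithm. The callback plays the role of an incremental hook (compare `getSummaryFacts`), and the returned set is what a post-pass hook (compare `getSummaryFactsPostIfds`) would see.

```kotlin
// Local stand-ins mirroring the shapes used by the module; not the real types.
interface DomainFact
object ZeroFact : DomainFact
data class TaintedVar(val name: String) : DomainFact // hypothetical fact

fun interface FlowFunctionInstance {
    fun compute(fact: DomainFact): Collection<DomainFact>
}

// Hypothetical flow function for an assignment `lhs = rhs`.
fun assignFlow(lhs: String, rhs: String) = FlowFunctionInstance { fact ->
    when (fact) {
        TaintedVar(rhs) -> setOf(fact, TaintedVar(lhs)) // taint flows to lhs
        TaintedVar(lhs) -> emptySet()                   // lhs is overwritten
        else -> setOf(fact)                             // everything else is kept
    }
}

// Toy driver: pushes facts through a straight-line list of flow functions.
// `onNewFact` is invoked as soon as a fact first appears after a step.
fun propagate(
    start: Set<DomainFact>,
    steps: List<FlowFunctionInstance>,
    onNewFact: (stepIndex: Int, fact: DomainFact) -> Unit
): Set<DomainFact> {
    var current = start
    steps.forEachIndexed { i, step ->
        val next = current.flatMap { step.compute(it) }.toSet()
        (next - current).forEach { onNewFact(i, it) }
        current = next
    }
    return current
}

fun main() {
    // Models `y = x; x = z` with `x` tainted at the start.
    val steps = listOf(assignFlow("y", "x"), assignFlow("x", "z"))
    val result = propagate(setOf(ZeroFact, TaintedVar("x")), steps) { i, fact ->
        println("step $i: new fact $fact")
    }
    println(result.contains(TaintedVar("y"))) // true
    println(result.contains(TaintedVar("x"))) // false
}
```

Reporting from the callback corresponds to reporting as soon as a fact is reached, while inspecting `result` corresponds to waiting until propagation has finished.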

To implement a bidirectional analysis, you may use the composite runners [SequentialBidiIfdsUnitRunner] and [ParallelBidiIfdsUnitRunner].

<!--- MODULE jacodb-analysis -->
<!--- INDEX org.jacodb.analysis -->

[runAnalysis]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis/run-analysis.html
[newApplicationGraphForAnalysis]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis/new-application-graph-for-analysis.html
[IfdsUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-ifds-unit-runner/index.html
[JcClasspath]: https://jacodb.org/docs/jacodb-api/org.jacodb.api/-jc-classpath/index.html
[IfdsBaseUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-ifds-base-unit-runner/index.html
[AnalyzerFactory]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-analyzer-factory/index.html
[Analyzer]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-analyzer/index.html
[JcApplicationGraph]: https://jacodb.org/docs/jacodb-api/org.jacodb.api.analysis/-jc-application-graph/index.html
[SequentialBidiIfdsUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-sequential-bidi-ifds-base-unit-runner/index.html
[ParallelBidiIfdsUnitRunner]: https://jacodb.org/docs/jacodb-analysis/org.jacodb.analysis.engine/-parallel-bidi-ifds-base-unit-runner/index.html
134 changes: 46 additions & 88 deletions jacodb-analysis/src/main/kotlin/org/jacodb/analysis/AnalysisMain.kt
@@ -14,103 +14,61 @@
 * limitations under the License.
 */

@file:JvmName("AnalysisMain")
package org.jacodb.analysis

import kotlinx.serialization.Serializable
import mu.KLogging
import org.jacodb.analysis.analyzers.AliasAnalyzerFactory
import org.jacodb.analysis.analyzers.NpeAnalyzerFactory
import org.jacodb.analysis.analyzers.NpePrecalcBackwardAnalyzerFactory
import org.jacodb.analysis.analyzers.SqlInjectionAnalyzerFactory
import org.jacodb.analysis.analyzers.SqlInjectionBackwardAnalyzerFactory
import org.jacodb.analysis.analyzers.TaintAnalysisNode
import org.jacodb.analysis.analyzers.TaintAnalyzerFactory
import org.jacodb.analysis.analyzers.TaintBackwardAnalyzerFactory
import org.jacodb.analysis.analyzers.TaintNode
import org.jacodb.analysis.analyzers.UnusedVariableAnalyzerFactory
import org.jacodb.analysis.engine.IfdsBaseUnitRunner
import org.jacodb.analysis.engine.SequentialBidiIfdsUnitRunner
import org.jacodb.analysis.engine.TraceGraph
import org.jacodb.analysis.engine.IfdsUnitManager
import org.jacodb.analysis.engine.IfdsUnitRunner
import org.jacodb.analysis.engine.Summary
import org.jacodb.analysis.engine.UnitResolver
import org.jacodb.analysis.engine.VulnerabilityInstance
import org.jacodb.analysis.graph.newApplicationGraphForAnalysis
import org.jacodb.api.JcMethod
import org.jacodb.api.cfg.JcExpr
import org.jacodb.api.cfg.JcInst

@Serializable
data class DumpableVulnerabilityInstance(
val vulnerabilityType: String,
val sources: List<String>,
val sink: String,
val traces: List<List<String>>
)

@Serializable
data class DumpableAnalysisResult(val foundVulnerabilities: List<DumpableVulnerabilityInstance>)

data class VulnerabilityInstance(
val vulnerabilityType: String,
val traceGraph: TraceGraph
) {
private fun JcInst.prettyPrint(): String {
return "${toString()} (${location.method}:${location.lineNumber})"
}

fun toDumpable(maxPathsCount: Int): DumpableVulnerabilityInstance {
return DumpableVulnerabilityInstance(
vulnerabilityType,
traceGraph.sources.map { it.statement.prettyPrint() },
traceGraph.sink.statement.prettyPrint(),
traceGraph.getAllTraces().take(maxPathsCount).map { intermediatePoints ->
intermediatePoints.map { it.statement.prettyPrint() }
}.toList()
)
}
}
import org.jacodb.api.analysis.JcApplicationGraph

fun List<VulnerabilityInstance>.toDumpable(maxPathsCount: Int = 3): DumpableAnalysisResult {
return DumpableAnalysisResult(map { it.toDumpable(maxPathsCount) })
}
internal val logger = object : KLogging() {}.logger

typealias AnalysesOptions = Map<String, String>

@Serializable
data class AnalysisConfig(val analyses: Map<String, AnalysesOptions>)

val UnusedVariableRunner = IfdsBaseUnitRunner(UnusedVariableAnalyzerFactory)

fun newSqlInjectionRunner(maxPathLength: Int = 5) = SequentialBidiIfdsUnitRunner(
IfdsBaseUnitRunner(SqlInjectionAnalyzerFactory(maxPathLength)),
IfdsBaseUnitRunner(SqlInjectionBackwardAnalyzerFactory(maxPathLength)),
)

fun newNpeRunner(maxPathLength: Int = 5) = SequentialBidiIfdsUnitRunner(
IfdsBaseUnitRunner(NpeAnalyzerFactory(maxPathLength)),
IfdsBaseUnitRunner(NpePrecalcBackwardAnalyzerFactory(maxPathLength)),
)

fun newAliasRunner(
generates: (JcInst) -> List<TaintAnalysisNode>,
sanitizes: (JcExpr, TaintNode) -> Boolean,
sinks: (JcInst) -> List<TaintAnalysisNode>,
maxPathLength: Int = 5
) = IfdsBaseUnitRunner(AliasAnalyzerFactory(generates, sanitizes, sinks, maxPathLength))

fun newTaintRunner(
isSourceMethod: (JcMethod) -> Boolean,
isSanitizeMethod: (JcMethod) -> Boolean,
isSinkMethod: (JcMethod) -> Boolean,
maxPathLength: Int = 5
) = SequentialBidiIfdsUnitRunner(
IfdsBaseUnitRunner(TaintAnalyzerFactory(isSourceMethod, isSanitizeMethod, isSinkMethod, maxPathLength)),
IfdsBaseUnitRunner(TaintBackwardAnalyzerFactory(isSourceMethod, isSinkMethod, maxPathLength))
)

fun newTaintRunner(
sourceMethodMatchers: List<String>,
sanitizeMethodMatchers: List<String>,
sinkMethodMatchers: List<String>,
maxPathLength: Int = 5
) = SequentialBidiIfdsUnitRunner(
IfdsBaseUnitRunner(TaintAnalyzerFactory(sourceMethodMatchers, sanitizeMethodMatchers, sinkMethodMatchers, maxPathLength)),
IfdsBaseUnitRunner(TaintBackwardAnalyzerFactory(sourceMethodMatchers, sinkMethodMatchers, maxPathLength))
)

internal val logger = object : KLogging() {}.logger
/**
 * This is the entry point for every analysis.
 * Calling this function will find all vulnerabilities reachable from [methods].
 *
 * @param graph an instance of [JcApplicationGraph] that provides a mixture of the CFG and the call graph
 * (called the supergraph in RHS95).
 * Usually built by [newApplicationGraphForAnalysis].
 *
 * @param unitResolver an instance of [UnitResolver] that splits all methods into groups of methods, called units.
 * Units are analyzed concurrently; each unit is analyzed by one call to the [IfdsUnitRunner.run] method.
 * In general, larger units mean a more precise but also more resource-consuming analysis, so [unitResolver] lets
 * you choose a trade-off.
 * It is guaranteed that the [Summary] passed to all units is the same, so they can share information through it.
 * However, the order in which the analysis of units is launched and terminated is an implementation detail and
 * may vary even for consecutive calls of this method with the same arguments.
 *
 * @param ifdsUnitRunner an [IfdsUnitRunner] instance that will be launched for each unit.
 * This is the main argument that defines the analysis.
 *
 * @param methods the list of methods to analyze.
 * Each vulnerability will only be reported if it is reachable from one of these.
 *
 * @param timeoutMillis the maximum time for the analysis.
 * Note that this does not include the time for precalculations
 * (like searching for reachable methods and splitting them into units) and postcalculations (like restoring traces),
 * so the actual running time of this method may be longer.
 */
fun runAnalysis(
    graph: JcApplicationGraph,
    unitResolver: UnitResolver<*>,
    ifdsUnitRunner: IfdsUnitRunner,
    methods: List<JcMethod>,
    timeoutMillis: Long = Long.MAX_VALUE
): List<VulnerabilityInstance> {
    return IfdsUnitManager(graph, unitResolver, ifdsUnitRunner, methods, timeoutMillis).analyze()
}
38 changes: 38 additions & 0 deletions jacodb-analysis/src/main/kotlin/org/jacodb/analysis/Dumpable.kt
@@ -0,0 +1,38 @@
/*
 *  Copyright 2022 UnitTestBot contributors (utbot.org)
 * <p>
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 * <p>
 *  http://www.apache.org/licenses/LICENSE-2.0
 * <p>
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 */

package org.jacodb.analysis

import kotlinx.serialization.Serializable
import org.jacodb.analysis.engine.VulnerabilityInstance

/**
 * Simplified version of [VulnerabilityInstance] that contains only serializable data.
 */
@Serializable
data class DumpableVulnerabilityInstance(
    val vulnerabilityType: String,
    val sources: List<String>,
    val sink: String,
    val traces: List<List<String>>
)

@Serializable
data class DumpableAnalysisResult(val foundVulnerabilities: List<DumpableVulnerabilityInstance>)

fun List<VulnerabilityInstance>.toDumpable(maxPathsCount: Int = 3): DumpableAnalysisResult {
    return DumpableAnalysisResult(map { it.toDumpable(maxPathsCount) })
}
@@ -20,35 +20,88 @@ import org.jacodb.api.JcMethod
import org.jacodb.api.analysis.JcApplicationGraph
import org.jacodb.api.cfg.JcInst

/**
 * Interface for flow functions, i.e. mappings of the kind DomainFact -> Collection of DomainFacts.
 */
fun interface FlowFunctionInstance {
    fun compute(fact: DomainFact): Collection<DomainFact>
}

/**
 * A marker interface for the facts that appear in analyses.
 */
interface DomainFact

/**
 * A special [DomainFact] that always holds.
 */
object ZEROFact : DomainFact {
    override fun toString() = "[ZERO fact]"
}

/**
 * Implementations of this interface should provide all four kinds of flow functions mentioned in RHS95,
 * thus fully describing how the facts are propagated through the supergraph.
 */
interface FlowFunctionsSpace {
    /**
     * @return facts that may hold when the analysis is started from [startStatement]
     * (these are needed to initialize the worklist in the IFDS analysis)
     */
    fun obtainPossibleStartFacts(startStatement: JcInst): Collection<DomainFact>
    fun obtainSequentFlowFunction(current: JcInst, next: JcInst): FlowFunctionInstance
    fun obtainCallToStartFlowFunction(callStatement: JcInst, callee: JcMethod): FlowFunctionInstance
    fun obtainCallToReturnFlowFunction(callStatement: JcInst, returnSite: JcInst): FlowFunctionInstance
    fun obtainExitToReturnSiteFlowFunction(callStatement: JcInst, returnSite: JcInst, exitStatement: JcInst): FlowFunctionInstance
}

/**
 * The [Analyzer] interface describes how facts are propagated and how vulnerabilities are produced from these facts
 * while [IfdsBaseUnitRunner] runs the tabulation algorithm.
 *
 * Two methods can turn facts into vulnerabilities or other [SummaryFact]s: [getSummaryFacts] and
 * [getSummaryFactsPostIfds]. The first is called during the analysis, each time a new path edge is found; the second
 * is called only after all path edges have been found.
 * While some analyses do need the full set of facts to find vulnerabilities, most analyses can report [SummaryFact]s
 * as soon as some fact is reached, so [getSummaryFacts] is the recommended way to report vulnerabilities when possible.
 *
 * Note that methods and properties of this interface may be accessed concurrently from different threads,
 * so implementations should be thread-safe.
 *
 * @property flowFunctions a [FlowFunctionsSpace] instance that describes how facts are generated and propagated
 * during the run of the tabulation algorithm.
 *
 * @property saveSummaryEdgesAndCrossUnitCalls when true, summary edges and cross-unit calls are automatically
 * saved to the summary (usually this property is true for forward analyzers and false for backward analyzers).
 */
interface Analyzer {
    val flowFunctions: FlowFunctionsSpace

    val saveSummaryEdgesAndCrossUnitCalls: Boolean
        get() = true

    /**
     * This method is called by [IfdsBaseUnitRunner] each time a new path edge is found.
     *
     * @return the [SummaryFact]s produced by this edge that need to be saved to the summary.
     */
    fun getSummaryFacts(edge: IfdsEdge): List<SummaryFact> = emptyList()

    /**
     * This method is called once by [IfdsBaseUnitRunner] when the propagation of facts is finished
     * (normally or due to cancellation).
     *
     * @return the [SummaryFact]s that can be obtained after fact propagation has completed.
     */
    fun getSummaryFactsPostIfds(ifdsResult: IfdsResult): List<SummaryFact> = emptyList()
}

/**
 * A functional interface that produces an [Analyzer] from a [JcApplicationGraph].
 *
 * It simplifies the instantiation of [IfdsUnitRunner]s: you don't have to pass the graph and the reversed
 * graph to [Analyzer]s directly, because the runner does it for you.
 */
fun interface AnalyzerFactory {
    fun newAnalyzer(graph: JcApplicationGraph): Analyzer
}

This file was deleted.
