# Reading and understanding a Java program from your own program

To analyze a Java program, your own program first has to read Java source code. Among the many available libraries, one useful Python library is `javalang` (https://github.com/c2nes/javalang). After installing it with `pip`, try the following example:

In [2]:
import javalang
your_ast = javalang.parse.parse("package javalang.brewtab.com; class Test {}")
print(your_ast)

CompilationUnit(imports=[], package=PackageDeclaration(annotations=None, documentation=None, modifiers=None, name=javalang.brewtab.com), types=[ClassDeclaration(annotations=[], body=[], documentation=None, extends=None, implements=None, modifiers=set(), name=Test, type_parameters=None)])


The `parse` function in the `javalang.parse` module takes a string (i.e., the source code text) and returns a tree-structured representation of it. This output is an AST (abstract syntax tree). You can also retrieve specific pieces of information from the AST:

In [4]:
print(your_ast.package.name)

javalang.brewtab.com


In [5]:
print(your_ast.types[0])

ClassDeclaration(annotations=[], body=[], documentation=None, extends=None, implements=None, modifiers=set(), name=Test, type_parameters=None)


In [6]:
print(your_ast.types[0].name)

Test
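Beyond reading single attributes, the AST can be searched for every node of a given kind. The following is a minimal sketch, assuming the `filter` method that `javalang` AST nodes provide, which yields `(path, node)` pairs; the helper name `declared_names` is hypothetical and not part of the library. The import is guarded only so that the sketch still loads where `javalang` is not installed.

```python
try:
    import javalang  # third-party: pip install javalang
except ImportError:  # keep the sketch loadable without the library
    javalang = None


def declared_names(source):
    """Return (class names, method names) declared in a complete Java source string."""
    if javalang is None:
        raise RuntimeError("javalang is not installed")
    tree = javalang.parse.parse(source)
    # filter() walks the whole tree and yields (path, node) for matching node types.
    classes = [node.name for _, node in tree.filter(javalang.tree.ClassDeclaration)]
    methods = [node.name for _, node in tree.filter(javalang.tree.MethodDeclaration)]
    return classes, methods


if javalang is not None:
    print(declared_names("class Test { void run() {} int size() { return 0; } }"))
```

Collecting names this way avoids hard-coding positions such as `types[0]`, which only works when you already know the layout of the file.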


We can also try to parse a snippet, that is, a fragment of Java source code that is not a complete program.

In [8]:
partial_ast = javalang.parse.parse("""System.out.println("Hello " + "world");""")
print(partial_ast)

JavaSyntaxError: 

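As expected, this does not work: `javalang.parse.parse` expects a complete compilation unit, and a bare statement is not one. For a snippet, tokenize the text first and then call the parser method that matches the kind of snippet: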

In [9]:
# tokenize first,
tokens = javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
# then parse it.
parser = javalang.parser.Parser(tokens)
parser.parse_expression()

MethodInvocation(arguments=[BinaryOperation(operandl=Literal(postfix_operators=[], prefix_operators=[], qualifier=None, selectors=[], value="Hello "), operandr=Literal(postfix_operators=[], prefix_operators=[], qualifier=None, selectors=[], value="world"), operator=+)], member=println, postfix_operators=[], prefix_operators=[], qualifier=System.out, selectors=[], type_arguments=None)
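The two steps above (tokenize, then parse) can be wrapped in a small helper that also absorbs the errors seen earlier. This is a sketch under assumptions: `try_parse_expression` is a hypothetical name, and returning `None` on failure is just one possible design. It relies on the `JavaSyntaxError` and `LexerError` exception classes that `javalang` defines in its `parser` and `tokenizer` modules.

```python
try:
    import javalang  # third-party: pip install javalang
except ImportError:  # keep the sketch loadable without the library
    javalang = None


def try_parse_expression(snippet):
    """Parse a Java expression snippet; return its AST node, or None on failure."""
    if javalang is None:
        return None
    try:
        # tokenize first,
        tokens = javalang.tokenizer.tokenize(snippet)
        # then parse the token stream as a single expression.
        return javalang.parser.Parser(tokens).parse_expression()
    except (javalang.parser.JavaSyntaxError, javalang.tokenizer.LexerError):
        # The snippet is not a well-formed Java expression.
        return None


node = try_parse_expression('System.out.println("Hello " + "world")')
if node is not None:
    print(type(node).__name__, node.member)
```

Returning `None` keeps a batch analysis running when one snippet is malformed; re-raising the exception would be the better choice if you need the error position.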

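The parser method you call has to match the tokens that remain in the snippet. Here the same `parser` has already consumed the expression, so asking it for an identifier fails: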

In [10]:
parser.parse_identifier()

JavaSyntaxError: 
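When it is unclear which parser method fits a snippet, it can help to look at the token stream directly. The sketch below assumes that each token produced by `javalang.tokenizer.tokenize` is an instance of a token class (such as `Identifier` or `Separator`) with a `value` attribute; the helper name `token_kinds` is hypothetical.

```python
try:
    import javalang  # third-party: pip install javalang
except ImportError:  # keep the sketch loadable without the library
    javalang = None


def token_kinds(snippet):
    """Return a list of (token class name, token text) pairs for a Java snippet."""
    if javalang is None:
        raise RuntimeError("javalang is not installed")
    # tokenize() is lazy, so iterate over it to materialize the tokens.
    return [(type(tok).__name__, tok.value)
            for tok in javalang.tokenizer.tokenize(snippet)]


if javalang is not None:
    for kind, text in token_kinds('System.out.println("Hello");'):
        print(kind, text)
```

A snippet whose first token is not an identifier cannot be parsed with `parse_identifier()`, which makes this kind of listing a quick way to diagnose a `JavaSyntaxError`.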

Let's parse a larger program.