Among the mobile operating systems available today, Android is the most popular. Its open architecture and large user base make it a particularly attractive target for malware, with serious consequences for both individuals and organizations. The risks keep growing as mobile devices become more integrated into daily life and store ever more sensitive information. Malware on Android can cause a range of harms, including financial loss, identity theft, and even physical harm.
We propose a hierarchical graph neural network for Android malware detection. Each APK is represented as a hierarchical graph that combines the inter-function call graph (FCG) with the intra-function control flow graphs (CFGs).
The model was inspired by the work of Ling et al., 2017, who used the same architecture to build MalGraph, a malware detection model for Windows PE files.
The hierarchical graph of an APK combines the Function Call Graph (FCG) and the Control Flow Graphs (CFGs). More precisely, an APK is represented by an FCG with two types of nodes:
- External methods, which are initially encoded as one-hot vectors based on their method names.
- Local methods, which are each represented by their corresponding CFG.
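The two node types can be pictured with a small data structure. The following is only a minimal sketch using standard-library dataclasses; the class names (`ControlFlowGraph`, `MethodNode`, `HierarchicalGraph`), the `external_one_hot` helper, and the toy method names are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ControlFlowGraph:
    """Intra-function graph: basic blocks plus directed edges between them."""
    blocks: List[str]              # one entry per basic block (hypothetical content)
    edges: List[Tuple[int, int]]   # (source block index, target block index)


@dataclass
class MethodNode:
    """One FCG node: a local method carries a CFG, an external one does not."""
    name: str
    cfg: Optional[ControlFlowGraph] = None

    @property
    def is_external(self) -> bool:
        return self.cfg is None


@dataclass
class HierarchicalGraph:
    """Inter-function call graph whose local nodes each hold a CFG."""
    nodes: List[MethodNode] = field(default_factory=list)
    calls: List[Tuple[int, int]] = field(default_factory=list)  # (caller, callee)

    def external_one_hot(self, vocabulary: List[str]) -> Dict[str, List[int]]:
        """One-hot encode each external method name against a fixed vocabulary."""
        index = {name: i for i, name in enumerate(vocabulary)}
        encoded = {}
        for node in self.nodes:
            if node.is_external:
                vector = [0] * len(vocabulary)
                if node.name in index:  # names outside the vocabulary stay all-zero
                    vector[index[node.name]] = 1
                encoded[node.name] = vector
        return encoded


# Toy APK (assumed names): one local method calling one external API method.
graph = HierarchicalGraph(
    nodes=[
        MethodNode(
            "com.example.Main.onCreate",
            cfg=ControlFlowGraph(blocks=["entry", "exit"], edges=[(0, 1)]),
        ),
        MethodNode("android.util.Log.d"),
    ],
    calls=[(0, 1)],
)
one_hot = graph.external_one_hot(["android.util.Log.d", "java.lang.String.length"])
```

In this sketch an external method is simply a node without a CFG, so the distinction between the two node types needs no extra flag.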
To achieve this structure, we define a pipeline of operations that includes the following steps:
- Download the dataset
- Unzip, extract the `.dex` file and analyze `AndroidManifest` for each APK
- Import the `.dex` files into the Ghidra tool
- Launch the Ghidra script to construct the FCG and CFGs
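The extraction step can be sketched with Python's standard `zipfile` module, since an APK is a ZIP archive. The function names `extract_dex_files` and `has_manifest` are hypothetical, and the sketch only checks that the manifest is present; it does not parse it, because the manifest inside an APK is stored as binary XML and needs a dedicated decoder.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_dex_files(apk_path: str, out_dir: str) -> List[Path]:
    """Extract every top-level .dex entry of an APK (a ZIP archive) into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            # classes.dex, classes2.dex, ... sit at the archive root
            if "/" not in name and name.endswith(".dex"):
                extracted.append(Path(apk.extract(name, path=str(out))))
    return extracted


def has_manifest(apk_path: str) -> bool:
    """Check that the archive contains an AndroidManifest.xml entry."""
    with zipfile.ZipFile(apk_path) as apk:
        return "AndroidManifest.xml" in apk.namelist()
```

The extracted `.dex` files would then be the inputs imported into Ghidra in the next step.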
Our model predicts the probability that a given Android application is malicious rather than benign. The hierarchical structure, which combines the inter-function FCG with the intra-function CFGs of an Android APK, is passed as input to the model, which outputs the probability that the APK is malware. The image below shows the main steps of our approach:
- Create a virtual environment by running the following command: `python -m venv ./venv`
- Activate the virtual environment by executing the script contained in the `venv` folder.
- Install the requirements: `pip install -r ./requirements.txt`

You are now ready to execute the code.
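Before running anything, it can help to confirm that the virtual environment is actually active. The snippet below is a small optional check, not part of the project; the helper name `in_virtualenv` is an assumption. It relies on the fact that, inside an environment created by `venv`, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```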


