StringDataItem decoding removed #689
This function merges the `decode` and `patch_string` functions.
It also adds size checking inside the `decode_and_patch` function.
Include `decode` and `encode` functions. Warning: `decode` doesn't return a printable string.
Patch for the `read_null_terminated_string` improvement.
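The commits do not show how `read_null_terminated_string` was improved. As a rough sketch of one common approach, assuming the data is already in memory as a `bytes` buffer, the terminator can be located with a single `bytes.find` call instead of reading byte by byte. The helper name `read_null_terminated` and its signature are hypothetical and are not Androguard's API.

```python
from typing import Tuple


def read_null_terminated(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Return the bytes before the next NUL and the offset just past it.

    Hypothetical helper for illustration only; not the function from this PR.
    """
    # bytes.find scans in C, which avoids a Python-level loop per byte.
    end = buf.find(b"\x00", offset)
    if end == -1:
        raise ValueError("no null terminator found")
    # The terminator itself is not part of the returned string.
    return buf[offset:end], end + 1


data = b"foo\x00bar\x00"
first, next_offset = read_null_terminated(data)
second, end_offset = read_null_terminated(data, next_offset)
```

Returning the next offset lets the caller read consecutive strings without re-scanning the buffer.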
Codecov Report
@@ Coverage Diff @@
## master #689 +/- ##
=========================================
+ Coverage 72.53% 72.6% +0.07%
=========================================
Files 50 50
Lines 16057 16188 +131
=========================================
+ Hits 11647 11754 +107
- Misses 4410 4434 +24
The changes seem to be OK according to the tests.
Test with the latest updates:
Runtime : 7.62316800s
Name | TotTime | Calls | Source
read | 0.971308 | 2654552 | bytecode.py:795
get_string | 0.612781 | 498603 | dvm.py:7280
from_bytes | 0.356392 | 653981 | mutf8.py:105
__init__ | 0.348029 | 77741 | dvm.py:6486
get_byte | 0.323302 | 1021791 | dvm.py:204
__init__ | 0.312980 | 653981 | mutf8.py:93
__init__ | 0.303062 | 77741 | dvm.py:6720
<built-in method builtins.isinst | 0.255098 | 3441379 |
<built-in method _struct.unpack> | 0.241028 | 2400031 |
get_type_ref | 0.224705 | 306893 | dvm.py:7346
readuleb128 | 0.195157 | 566395 | dvm.py:208
add_type_item | 0.175564 | 35 | dvm.py:7238
__init__ | 0.165144 | 100931 | dvm.py:2424
get | 0.164950 | 537811 | dvm.py:1906
__init__ | 0.141590 | 9894 | dvm.py:1460
_load_elements | 0.141182 | 44096 | dvm.py:3377
read_null_terminated_string | 0.127051 | 102793 | dvm.py:97
get_type | 0.126590 | 306893 | dvm.py:7331
__init__ | 0.115519 | 11024 | dvm.py:3258
__init__ | 0.111835 | 82413 | dvm.py:2823
The documentation may need to be updated.
What do you think about moving `MUTF8String` into `androguard.core.bytecodes.dvm`? And maybe renaming it to `DalvikString`?
Wow, you have been very busy :) Hmm, MUTF8 is not only used in Dalvik but in the whole Java world... So I would leave it as `MUTF8String`.
@reox Maybe put |
I moved the package to |
Sure! I pushed some bugfixes that unfortunately broke your PR, sorry! You could also merge all three of your PRs into a single one. I think it is enough to merge the branches locally and push one; that should automatically close the other two and add them to one PR.
Ah, I meant your two PRs :D
Fixing MUTF8String bug when calling __getitem__ with int
Mhh, last time I did that, the other PR would close... Maybe that changed or I did something different :D Anyway, merging these now!
Ah... Just realized: the other PR was closed automatically when this one was merged!
As mentioned in #686, not decoding the strings during parsing could bring a huge performance gain. I wrote a new class, `MUTF8String`, to replace the results of `decode(b)` and `patch_string(decode(b))`; it is capable of decoding when needed.
Test with `patch_string(decode(b))`:
Test with `MUTF8String(b)`:
It's also possible to optimize `read_null_terminated_string` as I did in #686.
Resolves: #684
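The lazy-decoding idea behind this PR can be illustrated with a minimal sketch: keep the raw bytes during parsing and decode only when a text form is requested. The class `LazyString` below is hypothetical and is not the `MUTF8String` implementation from this PR; it also uses Python's built-in UTF-8 codec as a stand-in, which is an assumption, since real MUTF-8 differs from UTF-8 (for example in how NUL and supplementary characters are encoded).

```python
class LazyString:
    """Hypothetical lazy string: stores raw bytes, decodes on first use."""

    __slots__ = ("_raw", "_text")

    def __init__(self, raw: bytes):
        self._raw = raw
        self._text = None  # decoded form, filled in on demand

    def __str__(self) -> str:
        # Decode once and cache the result.
        # Assumption: plain UTF-8 stands in for a real MUTF-8 decoder.
        if self._text is None:
            self._text = self._raw.decode("utf-8", errors="replace")
        return self._text

    def __eq__(self, other):
        # Comparisons work on the raw bytes, so no decoding is needed.
        if isinstance(other, LazyString):
            return self._raw == other._raw
        if isinstance(other, bytes):
            return self._raw == other
        return NotImplemented

    def __hash__(self) -> int:
        # Same hash as the underlying bytes, consistent with __eq__.
        return hash(self._raw)

    def __len__(self) -> int:
        # Length in bytes, not in decoded characters.
        return len(self._raw)


name = LazyString(b"Ljava/lang/String;")
not_yet_decoded = name._text is None  # nothing decoded during "parsing"
text = str(name)                      # decoding happens here, once
```

Because equality and hashing use the raw bytes, such objects can be compared and used as dictionary keys during parsing without ever paying the decoding cost.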