huang_bidtext #14

Open
hjjandy opened this issue Jun 30, 2016 · 10 comments

@hjjandy
Collaborator

hjjandy commented Jun 30, 2016

No description provided.

@brusso123

Dear authors,

The artefact is potentially insightful, useful, or usable, and deserves to be catalogued.
It is useful for researchers working with mobile apps because it reduces the false positives (FP) and false negatives (FN) in detecting sensitive data, which would otherwise require tedious and lengthy manual inspection.

The use of the artefact is clearly described in the GitHub project. The project can be easily updated with git and arable. It is not clear to me, though, whether Mac users will have issues installing the tool.

Barbara

@davidlo83

Dear Jianjun, Xiangyu, Lin, Tim, and Olga,

The following are my comments and recommendation:

[Insightful]

Timely (i.e., addresses a problem that is most current and most pressing)?
The paper addresses sensitive data disclosure, a privacy issue on mobile platforms. A large number of users now rely on mobile platforms, and privacy is one of their key concerns. Thus, the artifact is timely.

Makes researchers “smarter” in some way (e.g., identifies and fills some significant gap in prior work)?
The artifact extends SUPOR and UIPicker to identify generic API invocations that may return sensitive information. It examines the correlation between text labels and performs forward and backward data analysis to propagate the labels. It would be best if the authors further elaborated on the difference between the current work and existing work. I'm not an expert on Android privacy analysis, and the details in the short paper are not sufficient for me.
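
To make the idea of propagating labels "forward and backward" concrete, here is a minimal sketch. It is not BidText's implementation: the graph representation, the function name `propagate_labels`, and the node names in the usage note are all hypothetical, and the real analysis works on program data flow rather than on a hand-written edge list. The sketch only shows that a text label attached to one value can reach both the values it flows into and the values it was derived from.

```python
from collections import defaultdict, deque


def propagate_labels(edges, seeds):
    """Propagate text labels along data-flow edges in both directions.

    edges: iterable of (src, dst) pairs meaning "data flows from src to dst".
    seeds: dict mapping a node to the set of text labels initially attached to it.
    Returns a dict mapping every reached node to its final set of labels.
    (Hypothetical simplification; not the tool's actual algorithm.)
    """
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)  # forward: definition to use
        neighbours[dst].add(src)  # backward: use to definition

    labels = defaultdict(set)
    for node, texts in seeds.items():
        labels[node] |= set(texts)

    # Worklist iteration until no label set grows any further.
    worklist = deque(seeds)
    while worklist:
        node = worklist.popleft()
        for other in neighbours[node]:
            if not labels[node] <= labels[other]:
                labels[other] |= labels[node]
                worklist.append(other)
    return dict(labels)
```

For example, with edges `[("editText", "v1"), ("v1", "Log.d")]` and the seed `{"v1": {"password"}}`, the label reaches both `editText` (backward) and `Log.d` (forward).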

[Useful]

Serves a useful purpose?
App developers and markets can use the artifact to detect potentially sensitive data disclosures.

Serves a purpose that would otherwise be tedious, prolonged, awkward, or impossible?
The paper mentions that some "potentially sensitive data disclosures ... may not be able to be detected by existing solutions". However, it does not elaborate further. More information is needed on the extent to which the artifact improves over existing work.

Cost-effective?
No information is given.

[Usable (co-reviewed by Ferdian Thung)]

Easy to understand?

For the most part, the documentation is easy to follow. A step-by-step guide is given for running the program, and pointers on what to do next are always given. However, the user is assumed to be familiar with creating a virtual machine in VirtualBox from the given vmdk files. Knowledge of installing with gradle is also assumed.

Accompanied by tutorial notes?

Notes are provided on GitHub (downloading compiled files, test apps, virtual machine disks), on Bitbucket (general info on source code), in the artifact paper (general), and in the README on the virtual machine desktop (general, how to run the code, how to read the results). There is no documentation on how to load the virtual machine disks or how to compile.

Artifacts:

Downloads for the virtual machine and source code are available. Updates of compiled files, test apps, virtual machine disks, and source code are supported through git. Compilation and execution run fine.

[Recommendation]

maybe platinum

Yours Sincerely,

David

@hjjandy
Collaborator Author

hjjandy commented Jul 24, 2016

Dear Barbara @brusso123,

For Mac users, I think they can install VirtualBox for Mac (Intel hardware: http://download.virtualbox.org/virtualbox/5.1.0/VirtualBox-5.1.0-108711-OSX.dmg) and run the tool provided in the VM images. They can also install Java 8 and run the tool from the non-VM artifacts. We do not have a Mac at hand, so we did not test the tool on that platform.

Thanks,
Jianjun

@timm
Contributor

timm commented Jul 25, 2016

Note these labels are still "under discussion" and are still subject to change prior to the final notifications Friday.

@emhill

emhill commented Jul 26, 2016

I can confirm that the vmdk files won't load into VirtualBox on a Mac. I've googled the issue and found the same directions for doing it over and over again, but my installation of VirtualBox is giving me an error:

[Screenshot of the VirtualBox error, taken 2016-07-26 at 7:04:14 pm]

If anyone has a suggested workaround, please let me know!

I've found that ova files tend to be less error prone when sharing across VMware and VirtualBox on different platforms (it's what I use to share VMs with my students).

I don't have VMware player installed at the moment to test. I bow to the other reviewers' experience.

@emhill added the thrice label and removed the twice label on Jul 26, 2016
@hjjandy
Collaborator Author

hjjandy commented Jul 27, 2016

@emhill, thanks for the feedback. We do not have a Mac to test the VM images, so could you please test the tool using the non-VM artifact (https://github.com/hjjandy/FSE16-BidText-Artifacts), which contains everything in the VM except the source code (https://bitbucket.org/hjjandy/toydroid.bidtext)? The non-VM artifact was tested on Windows 10 and Ubuntu 14.04 with Java 8 (x64) installed. If the provided start scripts work, I think the non-VM artifact will also work well on a Mac.

@emhill

emhill commented Jul 27, 2016

Thanks for pointing out this other link; I was able to run it using those sources. It might be helpful to suggest in the text which OS works best with which type of artifact, so users don't spend time trying to download and install one that is unlikely to work.

@emhill

emhill commented Jul 27, 2016

Sorry, one more question: when I run it on the Motivating example, should it throw exceptions from the Stanford parser? I assume not (see the exceptions thrown below). If I recall correctly, I don't _think_ englishPCFG.ser.gz is included in the Stanford jars.

2016-07-27 08:53:26.682 INFO  |  - dump text for sink: android.util.Log.d(Ljava/lang/String;Ljava/lang/String;)I [AnalysisUtil]
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
java.io.IOException: Unable to resolve "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:448)
    at edu.stanford.nlp.io.IOUtils.readStreamFromString(IOUtils.java:381)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromSerializedFile(LexicalizedParser.java:628)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:423)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:182)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:161)
    at edu.purdue.cs.toydroid.bidtext.analysis.TextAnalysis.check(TextAnalysis.java:230)
    at edu.purdue.cs.toydroid.bidtext.analysis.TextAnalysis.analyze(TextAnalysis.java:156)
    at edu.purdue.cs.toydroid.bidtext.analysis.AnalysisUtil.dumpTextForSink(AnalysisUtil.java:180)
    at edu.purdue.cs.toydroid.bidtext.analysis.AnalysisUtil.dumpTextForSinks(AnalysisUtil.java:123)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.analyze(TextLeak.java:267)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.call(TextLeak.java:165)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.call(TextLeak.java:55)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Loading parser from text file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz java.io.IOException: Unable to resolve "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:448)
    at edu.stanford.nlp.io.IOUtils.readerFromString(IOUtils.java:575)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromTextFile(LexicalizedParser.java:562)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:425)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:182)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:161)
    at edu.purdue.cs.toydroid.bidtext.analysis.TextAnalysis.check(TextAnalysis.java:230)
    at edu.purdue.cs.toydroid.bidtext.analysis.TextAnalysis.analyze(TextAnalysis.java:156)
    at edu.purdue.cs.toydroid.bidtext.analysis.AnalysisUtil.dumpTextForSink(AnalysisUtil.java:180)
    at edu.purdue.cs.toydroid.bidtext.analysis.AnalysisUtil.dumpTextForSinks(AnalysisUtil.java:123)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.analyze(TextLeak.java:267)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.call(TextLeak.java:165)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.call(TextLeak.java:55)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-07-27 08:53:26.709 ERROR | ExecutionException: java.lang.NullPointerException [TextLeak]
java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.doAnalysis(TextLeak.java:121)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.main(TextLeak.java:85)
Caused by: java.lang.NullPointerException
    at edu.purdue.cs.toydroid.bidtext.analysis.TextAnalysis.check(TextAnalysis.java:233)
    at edu.purdue.cs.toydroid.bidtext.analysis.TextAnalysis.analyze(TextAnalysis.java:156)
    at edu.purdue.cs.toydroid.bidtext.analysis.AnalysisUtil.dumpTextForSink(AnalysisUtil.java:180)
    at edu.purdue.cs.toydroid.bidtext.analysis.AnalysisUtil.dumpTextForSinks(AnalysisUtil.java:123)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.analyze(TextLeak.java:267)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.call(TextLeak.java:165)
    at edu.purdue.cs.toydroid.bidtext.TextLeak.call(TextLeak.java:55)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I think the issue is with the models jar that's included. I can't even see the contents when I try to display them on the command line:

$ jar tf BidText-Bin/lib/stanford-parser-3.4.1-models.jar 
java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:220)
    at java.util.zip.ZipFile.<init>(ZipFile.java:150)
    at java.util.zip.ZipFile.<init>(ZipFile.java:121)
    at sun.tools.jar.Main.list(Main.java:1060)
    at sun.tools.jar.Main.run(Main.java:291)
    at sun.tools.jar.Main.main(Main.java:1233)
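
One way to narrow down a failure like this is to scan the jars in the lib directory and report which ones contain the model entry named in the exception and which ones cannot be opened as zip archives at all. The following is a minimal sketch using only the Python standard library; the function name `find_entry` is hypothetical, and the `BidText-Bin/lib` path in the usage note is taken from the command above.

```python
import zipfile
from pathlib import Path

# Resource path copied from the IOException message above.
MODEL_ENTRY = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def find_entry(lib_dir, entry=MODEL_ENTRY):
    """Return (jars that contain `entry`, jars that are not valid zip files)."""
    providers, unreadable = [], []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    providers.append(jar.name)
        except zipfile.BadZipFile:
            # Same symptom as the "error in opening zip file" from `jar tf`.
            unreadable.append(jar.name)
    return providers, unreadable
```

Calling `find_entry("BidText-Bin/lib")` returns two lists. If the models jar shows up in the second list, the parser cannot load the model because the jar itself is not a valid archive.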

@hjjandy
Collaborator Author

hjjandy commented Jul 27, 2016

Dear @emhill, the models jar is a large file (>100 MB). It is stored using Git LFS (https://git-lfs.github.com/). Please execute "git lfs pull" to download those files (see README.md), and make sure the LFS plugin is installed. On Ubuntu 14.04, with Git LFS installed, "git clone" automatically downloads the files stored with LFS.

jjhuang@H-J2:~/FSE16-Artifacts$ git clone https://github.com/hjjandy/FSE16-BidText-Artifacts.git
Cloning into 'FSE16-BidText-Artifacts'...
remote: Counting objects: 382, done.
remote: Total 382 (delta 0), reused 0 (delta 0), pack-reused 382
Receiving objects: 100% (382/382), 71.53 MiB | 3.34 MiB/s, done.
Resolving deltas: 100% (163/163), done.
Checking connectivity... done.
Downloading BidText-Bin/lib/stanford-parser-3.4.1-models.jar (173.71 MB)
Downloading BidText-TestApps/Others/Kal.FlightInfo-123.apk (3.38 MB)
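
When a repository that uses Git LFS is cloned without the LFS plugin, each LFS-tracked file is checked out as a small text pointer whose first line is `version https://git-lfs.github.com/spec/v1`. The following minimal sketch tells such a pointer apart from a real jar. The function name `classify_jar` and its return values are hypothetical; the check relies only on the pointer's first line and on Python's standard `zipfile.is_zipfile`.

```python
import zipfile
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def classify_jar(path):
    """Return 'lfs-pointer', 'jar', or 'unknown' for the file at `path`."""
    p = Path(path)
    with p.open("rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    if head == LFS_POINTER_PREFIX:
        return "lfs-pointer"  # placeholder only; run "git lfs pull" to fetch the content
    if zipfile.is_zipfile(str(p)):
        return "jar"
    return "unknown"
```

If `classify_jar("BidText-Bin/lib/stanford-parser-3.4.1-models.jar")` returns `'lfs-pointer'`, the 173.71 MB jar has not been downloaded yet.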

@timm
Contributor

timm commented Jul 27, 2016

Dear @hjjandy and @emhill. That's enough work on this one. We are moving to final decisions now.

@timm closed this as completed on Aug 1, 2016
@timm reopened this on Aug 1, 2016