-
Notifications
You must be signed in to change notification settings - Fork 275
feat: add import detection (APP-77) #27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some description of PR next time. Describe what you did and your intention on merging to master or just to keeping an updates on WIP.
Possibly regexs not working and can be optimized, I include fix in comments. Also, we should make tests in next PR.
I will implement empty methods and integrate those classes.
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val quoteRegex = Regex("#import\\s+\"(\\w+)[/\\w+]*\\.\\w+\"") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test it? Probably shouldn't work.
Also, you can use only one regex with OR: #import\s+[\"|<](\w+)[/\w+]*.\w+[\"|>]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix double slash everywhere.
updates: Double slash ok here, but better to use triple quotes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val regex = Regex("use\\s+(\\w+)[\\\\\\w+]*") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
require, require_once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include, include once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val csLibraries = File("data/libraries/cs_libraries.txt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be definitely optimized cuz everytime extractImports called a file will be read. IO operations take a lot of time. I'll do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val javaLibraries = File("data/libraries/java_libraries.txt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some Extractors you using list of imports, in some not, why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends. In js, there are sometimes no import lines, that's why we tokenize all lines and see what tokens are in the import list. In this case we should catch things like ((com|ru).)?organization.library.modules.* and also skip imports of inner files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you gonna use this libraries files in a classifier for every language?
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val importQuotesRegex = Regex("#import\\s+\"(\\w+)[/\\w+]*\\.\\w+\"") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use OR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| import app.model.DiffFile | ||
| import java.io.File | ||
|
|
||
| class CsharpExtractor : ExtractorInterface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CSharpExtractor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
|
|
||
| class CExtractor : ExtractorInterface { | ||
| override fun extract(files: List<DiffFile>): List<CommitStats> { | ||
| TODO("not implemented") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mean that exception will be thrown if this method will be called. Not a good idea to merge such things to master.
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val quoteRegex = Regex("#import\\s+\"(\\w+)[/\\w+]*\\.\\w+\"") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include maybe?
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val quoteRegex = Regex("#import\\s+\"(\\w+)[/\\w+]*\\.\\w+\"") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
| val regex = Regex("import\\s+(\\w+[.\\w+]*)") | ||
| fileContent.forEach { | ||
| val res = regex.find(it) | ||
| if (res != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (res == null) return – inverse it to achieve lesser amount of nesting
| override fun extractImports(fileContent: List<String>): List<String> { | ||
| val libraries = mutableSetOf<String>() | ||
|
|
||
| val regex = Regex("use\\s+(\\w+)[\\\\\\w+]*") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include, include once?
sergey48k
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved by see comments
| files.map { file -> | ||
| file.old.imports = extractImports(file.old.content) | ||
| file.new.imports = extractImports(file.new.content) | ||
| file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last line is return value.
| val TYPE_LANGUAGE = 1 | ||
| val TYPE_KEYWORD = 2 | ||
|
|
||
| val SEPARATOR = ">" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this used at all? maybe it can be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used in JavaExtractor for example to separate language from tech (or keywords, etc). "Java>Keyword".
|
|
||
| package app.model | ||
|
|
||
| import org.eclipse.jgit.diff.Edit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyright/author
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is.
* feat: add import detection(APP-77) * feat: integrate import detection * feat: add data for libraries, fix php regex matching
No description provided.