javasrc2cpg: Handling Generic Types + Type Arguments #2655
* Fixed bug where `MethodReturn` type will be `ANY` for generic types due to type arguments being added to the search
* Created a prototype Type->TypeArgument tree for representing nested type arguments
* Added `evalType` edges to `Ast` class
joern-cli/frontends/javasrc2cpg/src/main/scala/io/joern/javasrc2cpg/passes/AstCreator.scala
Okay, I have a plan: Java will simply give types with fully qualified type arguments (if possible), i.e. instead of Then we can query e.g. @johannescoetzee does this sound more sane?
Regarding the type edges, the TypeUsagePass is perhaps a good reference (at least as a start). That creates Type->REF->TypeDecl edges, along with the eval types to match.
TypeUsagePass seems to use the linking util pretty blindly, I don't think polymorphic types will be handled well here. Seems like But this is a future issue and probably more appropriate for internal generic classes in any case.
@DavidBakerEffendi Fabs and I discussed this a while ago in the context of inheritsFrom and decided against including generic types in the typeFullName. This was a while ago, so I can't remember exactly why we decided this. I think it was for policy compatibility with the closed source java frontend. Since our closed source policies don't include the generic types, we omitted those from javasrc *fullNames as well. It might be possible to add a workaround on the closed source side, but for now I wouldn't go with this approach.
Yeah, it doesn't handle polymorphic types well at all. I was just suggesting it as a reference for the currently expected edge types. There's definitely still a lot of room for improvement for how we handle generics in general.
@johannescoetzee okay, for backwards compatibility reasons I could try to register nodes that are eval'ed to generic types and accumulate them on one side, then have TypeUsagePass run over them as normal. Using the generic type info I kept aside, I can generate additional generic type nodes to add on top of that? So we would see both List and List. But the node itself would only have List. Then if details of a generic type are available we can see them via a query.
This is perhaps a conversation that @ml86 should be a part of, since he'll have a much better idea of what the implications of this are for the closed source dataflow tracker. I'm not sure how important uniqueness of types is, but I know uniqueness of methodFullNames is a topic that's been brought up.
TypeArgument and TypeParameter nodes were added in the early days of the CPG as placeholders, but so far have never been implemented by any frontend. This is also the reason why there are basically no querying facilities: we just never needed them. Changing the full names and structure is pretty much out of the question since we need to keep compatibility with the internal byte code frontend. I am curious why you need the generic types in the CPG? After all, in the JVM world the type-erased type names are what counts.
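To make the type-erasure point concrete, here is a minimal Java sketch. The class `ErasureDemo` and its method `ids` are hypothetical names invented for this illustration and are not part of joern. It only relies on standard `java.lang.reflect` calls: the erased return type is what a descriptor-based full name would see, while the generic signature is still available as metadata.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of javasrc2cpg.
class ErasureDemo {
    // A method whose declared return type carries a type argument.
    static List<Long> ids() {
        return new ArrayList<>();
    }

    // Both lists share one runtime class: the type argument is erased.
    static boolean sameRuntimeClass() {
        List<Long> longs = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return longs.getClass() == ints.getClass();
    }

    private static Method idsMethod() {
        try {
            return ErasureDemo.class.getDeclaredMethod("ids");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The erased return type, i.e. the raw interface name.
    static String erasedReturnType() {
        return idsMethod().getReturnType().getName();
    }

    // The generic return type recovered from the retained signature metadata.
    static String genericReturnType() {
        return idsMethod().getGenericReturnType().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(erasedReturnType());
        System.out.println(genericReturnType());
    }
}
```

The erased name is the same for every instantiation of `List`, which is why keeping it as the full name stays compatible with a bytecode-based view, while the type arguments would have to be recorded separately.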
@ml86 One example is generating API definitions from method headers + Spring annotations. If there is
In the latest commit, I've gone ahead and added a But there is code to effectively create the type argument tree in the CPG. Hopefully then from this tree we would be able to query around the arguments.
If the schema is up for change, then I think I have an okay solution and can propose something with a diagram sometime.
A rather rudimentary diagram, but since the description of Type nodes is that they are an instance of a TypeDecl, we can keep instances of them that have TypeArgument nodes that can be queried. These can all point to the same TypeDecl as usual. I've also kept the default Type node that would be generated by TypeUsagePass.
@ml86 and I had a chat about this one offline. The correct way to use the schema was explained, so I'll be able to handle that, but where we implement this comes with two options, where the main goal is to avoid duplication. It was also mentioned that With that, here are two proposed options:
Currently, my `Type-AST->TypeArgument(-AST>TypeArgument)*` edges are not working... I think this needs to be `Type-REF>TypeDecl->TypeParameter(-AST>TypeParameter)*`?

I suppose `List<Long>` and `List<Integer>` get their own `TypeDecl` for this kind of schema? Right now, there is only `List`.
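As a rough illustration of what a nested type-argument tree could look like, here is a small Java sketch. `TypeArgNode` and its fields are hypothetical and written only for this example; they are not the prototype from this PR and not the CPG schema. The sketch assumes the input is a fully qualified type string with balanced angle brackets, and it keeps the erased name separate from the arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not a class from javasrc2cpg or the CPG schema.
final class TypeArgNode {
    final String erasedName;                                   // e.g. "java.util.List"
    final List<TypeArgNode> typeArguments = new ArrayList<>(); // nested arguments, in order

    TypeArgNode(String erasedName) {
        this.erasedName = erasedName;
    }

    // Parses e.g. "java.util.Map<java.lang.String, java.util.List<java.lang.Long>>".
    static TypeArgNode parse(String fullName) {
        String s = fullName.trim();
        int open = s.indexOf('<');
        if (open < 0) {
            return new TypeArgNode(s); // no type arguments at this level
        }
        int close = s.lastIndexOf('>');
        if (close < open) {
            throw new IllegalArgumentException("Unbalanced type: " + fullName);
        }
        TypeArgNode node = new TypeArgNode(s.substring(0, open).trim());
        String inner = s.substring(open + 1, close);
        int depth = 0;
        int start = 0;
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            if (c == '<') {
                depth++;
            } else if (c == '>') {
                depth--;
            } else if (c == ',' && depth == 0) {
                // A top-level comma separates sibling type arguments.
                node.typeArguments.add(parse(inner.substring(start, i)));
                start = i + 1;
            }
        }
        if (!inner.substring(start).trim().isEmpty()) {
            node.typeArguments.add(parse(inner.substring(start)));
        }
        return node;
    }
}
```

With this shape, `parse("java.util.List<java.lang.Long>")` and `parse("java.util.List<java.lang.Integer>")` share the erased name `java.util.List` and differ only in their children, which mirrors the idea of several Type instances pointing at one `List` TypeDecl.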