More flexible design for generator #14

yannisgu · 2014-10-20T18:08:45Z

When implementing the PR #13 I realized that the current design is not that flexible. We should discuss about some enhancements in the design 😄

Problems of the current design I see (personal opinion):

To add new features, code-changes must be performed at different places deep in the code. There is no simple possibility to extend the code
Not perfectly testable
Tightly coupled to HTML: Since a document provided by roslyn is translated directly to HTML it is not easy to add a search-index with rich metadata.

Proposed high-level design

Code analysis and html (and index) creation are two completly different steps
The code analysis creates an simple object model from the code which is then passed to the transformers
The model could like similar to this:

There are two transformers
- HTML: generates the HTML and writes the JSON documents
- Indexing: updates the search index (e.g with Lucene.Net)

Thoughts?

MarcinJuraszek · 2014-10-22T07:30:03Z

Looks much better than just concatenating strings :)

Do you want to tight simplified code structure model and internal links between read tokens within the same model? I have a bit of concerns about that. I think the linking should be done as part of HTML transform pass. It would also allow you to run code analysis on multiple files in parallel, as you don't have to have any knowledge from other files while doing that. It would be just simple transformation between Roslyn syntax tree into your own model.
Do you want to capture between related files, e.g. aspx and aspx.cs with the same name? If so, you'd need a way to relate two Document instances together.

But I was wondering if you need your simplified model at all. I think you can run your transformers on Roslyn Syntax Tree directly. Just structure it better than before :)

JoshVarty · 2014-10-22T08:14:47Z

The current design certainly leaves a lot to be desired, and was thrown together as a proof of concept to gauge whether or not others would be interested in this project. It appears they are, so I agree we should rethink the design.

I think it's a good idea to use a simplified model instead of running the transformers on Roslyn's syntax tree. One of my long term goals for this project is to support languages beyond C# and VB .Net. If we can define a simple model, then the only language-dependent functionality would be the creation of this model. We could use the HTML and search transformers on any model, regardless of the underlying language.

yannisgu · 2014-10-22T09:58:23Z

Thank you for your inputs @MarcinJuraszek and @JoshVarty

I think the extra model is it worth. You are right @MarcinJuraszek a object link across documents breaks possibility for parallelism (or add extra complexity). Maybe we can agree that the FullName of a token is the key. So we can create links based on string-keys.
Kind of relation between between documents would be great. Not sure what is better a property ParentPath in document (contains the path of the parent file/folder) or a property Parent referencing to the parent documenet-object.

ParentPath(Edit):

Parent(Edit):

MarcinJuraszek · 2014-10-22T18:42:28Z

@JoshVarty Right, I forgot about that. Simplified model is definitely better if you're planning to get support for languages beyond Roslyn project.

@yannisgu I link the Parent approach. It seems to be much more flexible. However, I don't like the name. `IFileSystemObject' suggests it's file system related structure. When actually it's Solution/Project-only relationship. For aspx and aspx.cs files there is no File System structure, they both live in the same folder. VS shows the relationship because it's aware of it.

Also, if you're planning on moving forward with other languages, you should remember that for some languages (e.g. F#) order of files/folders in project/solution is important. Adding it to the model now seems like a good idea.

yannisgu · 2014-10-22T19:15:18Z

I just can't find a good name 😞 IProjectObject, but not every language has a project system? Or IItem seems a bit too general? Anyone have a good name?

To handle order we probably need a property Childern

JoshVarty · 2014-10-22T20:09:42Z

What sort of relationship are you trying to establish between aspx and aspx.cs files with the same name?

AmadeusW · 2014-10-23T07:51:11Z

@yannisgu & @MarcinJuraszek I agree with the high level design. It will take advantages of single responsibility principle. The various transformers may execute asynchronously if HTML transformer finished and directs the web user to the html site for the project. It will be huge for testability.
As for the interface name, how about language agnostic ICodeItem or if we also represent directories, IProjectObject or IProjectItem?

@JoshVarty I think Document <-> Document, because these two files are logically connected. Similarly, .cs, .xaml.cs, and underlying .g.cs are partial classes that live in documents that have a relationship. I think that it is in scope of Document to handle logical connection, while proposed IToBeDetermined would handle physical location. Toughts?

JoshVarty · 2014-10-23T08:10:27Z

@AmadeusW

You're right. Now that I think about it, the relationship would need to be established if we were to properly display the files in the treeview. (That is display the backing .cs and .g.cs. files as children of their corresponding document the same way Visual Studio's Solution Explorer does)

yannisgu · 2014-10-23T21:34:35Z

Ok. I think we agree now all on the high-level design. The last discussion point about this design is about how to handle physical/logical relation between documents and folders.

So I hope those are the last diagrams I add to this post:

a) We handle physical and logical relation as the same thing, since a document cannot have at the same time a parent-folder and a parent-document:

or b) we separate those things, since they are not directly related

AmadeusW · 2014-10-24T05:00:34Z

I like option a.
I'd just like to add a property to the overarching Model: Version information. Sooner or later we'll find ourselves having to update the model, we should know what version is that model from so that we can process it properly.

JoshVarty · 2014-10-24T05:55:29Z

@yannisgu I prefer approach A.

One final note: TokenLink should probably be renamed to SymbolLink as we're not actually linking to a single token.

yannisgu · 2014-10-24T06:25:02Z

Ok so go for IProjectItem.
@JoshVarty ok let's rename it

@AmadeusW hmm don't get that.. In which situation the version with which you created the model don't match the version in which you are consuming it?

AmadeusW · 2014-10-24T14:41:17Z

@yannisgu ah, is the model a transient structure that exists solely for a short period of time between user selects a github repo and we finish processing it?

yannisgu · 2014-10-24T16:10:42Z

Yep, that's what I thought.

So, how to proceed here? This change will probably break everything, so maybe all PRs should be closed first?

Based off the design discussed in #14

JoshVarty · 2014-10-26T19:18:06Z

After implementing the inital design, I think only one question remains.

How should the SymbolLink (formerly TokenLink) represent the referenced type?

If we use fully qualified name, the HTML transformer will need to know where every type is defined (ie. the document and line number, so it can build the correct link).This seems like something that would be beyond the scope of the HTML transformer.

Maybe we include: Fully qualified name, containing document (or document path) and line number?

yannisgu · 2014-10-26T21:01:42Z

See: #14 (comment)

Do you want to tight simplified code structure model and internal links between read tokens within the same model? I have a bit of concerns about that. I think the linking should be done as part of HTML transform pass. It would also allow you to run code analysis on multiple files in parallel, as you don't have to have any knowledge from other files while doing that. It would be just simple transformation between Roslyn syntax tree into your own model.

The linking should be done by the HtmlTransformer. In fact the HtmlTransformer know about the whole model since it transforms the entire model. The code to find the correct document/line should not be to extensive with LINQ.

JoshVarty · 2014-10-26T22:37:09Z

Ok, I'm a little concerned about performance, but we can see once we've gotten there.

I've started rewriting the DocumentWalker to generate the DocumentModels. One more question: How should we represent trivia? Should we have a Leading and Trailing trivia field for each token?

Based off the design discussed in #14

AmadeusW · 2014-11-29T02:54:17Z

Just a reminder: @JoshVarty remember to support folders which are in the repo but come before the .sln file. Currently they are not in the model, and we don't account for them (but we should)

JoshVarty · 2014-12-02T20:10:06Z

What are the opinions on the first three transformers?

I've created a base abstract class AbstractWorkspaceVisitor that all transformers inherit from. It takes care of visiting the projects and folders properly.

Based on their needs transformers may override:

This allows transformers to process only the pieces of the WorkspaceModel that they're interested in. (ie. A search index transformer likely does not need to visit Folders when building its index.)

If everyone is roughly satisfied with this approach, I'll go ahead and close this Issue. Also, I think @MarcinJuraszek was right about immutability. There's no needs for a lot of the WorkspaceModel items to have setters. However, I'll create a separate issue for that.

MarcinJuraszek · 2014-12-02T21:20:39Z

Wow, I think I missed all the work that was happening in the workspaceModel branch ...

Great job guys! I'll take a look into all of this tonight.

JoshVarty added a commit that referenced this issue Oct 26, 2014

Initial skeleton structure of WorkspaceModel

7306e68

Based off the design discussed in #14

JoshVarty added a commit that referenced this issue Nov 10, 2014

Initial skeleton structure of WorkspaceModel

c5d0078

Based off the design discussed in #14

JoshVarty closed this as completed Dec 8, 2014

JoshVarty mentioned this issue Dec 8, 2014

Make model elements immutable #46

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More flexible design for generator #14

More flexible design for generator #14

yannisgu commented Oct 20, 2014

MarcinJuraszek commented Oct 22, 2014

JoshVarty commented Oct 22, 2014

yannisgu commented Oct 22, 2014

MarcinJuraszek commented Oct 22, 2014

yannisgu commented Oct 22, 2014

JoshVarty commented Oct 22, 2014

AmadeusW commented Oct 23, 2014

JoshVarty commented Oct 23, 2014

yannisgu commented Oct 23, 2014

AmadeusW commented Oct 24, 2014

JoshVarty commented Oct 24, 2014

yannisgu commented Oct 24, 2014

AmadeusW commented Oct 24, 2014

yannisgu commented Oct 24, 2014

JoshVarty commented Oct 26, 2014

yannisgu commented Oct 26, 2014

JoshVarty commented Oct 26, 2014

AmadeusW commented Nov 29, 2014

JoshVarty commented Dec 2, 2014

MarcinJuraszek commented Dec 2, 2014

More flexible design for generator #14

More flexible design for generator #14

Comments

yannisgu commented Oct 20, 2014

Proposed high-level design

MarcinJuraszek commented Oct 22, 2014

JoshVarty commented Oct 22, 2014

yannisgu commented Oct 22, 2014

MarcinJuraszek commented Oct 22, 2014

yannisgu commented Oct 22, 2014

JoshVarty commented Oct 22, 2014

AmadeusW commented Oct 23, 2014

JoshVarty commented Oct 23, 2014

yannisgu commented Oct 23, 2014

AmadeusW commented Oct 24, 2014

JoshVarty commented Oct 24, 2014

yannisgu commented Oct 24, 2014

AmadeusW commented Oct 24, 2014

yannisgu commented Oct 24, 2014

JoshVarty commented Oct 26, 2014

yannisgu commented Oct 26, 2014

JoshVarty commented Oct 26, 2014

AmadeusW commented Nov 29, 2014

JoshVarty commented Dec 2, 2014

MarcinJuraszek commented Dec 2, 2014