Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add library dependency detection #2792

Closed
wants to merge 3 commits into from
Closed

Add library dependency detection #2792

wants to merge 3 commits into from

Conversation

PaulStoffregen
Copy link
Sponsor Contributor

As recently discussed on the mailing list:

https://groups.google.com/a/arduino.cc/forum/#!topic/developers/b9Ckocu3ptw

Solves issue #236

@PaulStoffregen
Copy link
Sponsor Contributor Author

In theory, this should properly deal with #include inside of #ifdef. But only in a library. Sketch preprocessing is not changed. This only applies to detecting dependencies among library.

It should now be possible to do something like this inside a sensor library using I2C

  #ifdef  __AVR_ATtiny85__
  #include "TinyWire.h"
  #else 
  #include "Wire.h"
  #endif

Of course, this hasn't been throughly tested. I know some of the people who've asked me to implement this feature have wanted this. Now's the time to give the test build a try and see if it works for such complex cases.

@matthijskooijman
Copy link
Collaborator

Code looks good to me overall. However, I really think that hardcoding the preproccesor / gcc options in the java code is the wrong way to go, these should be defined by platform.txt. Having said that, I guess this approach is perfectly fine for initial testing, it should just be generalized before merging this to master (but odds are that this was your plan already)

At one point I thought this code could be fooled by a sketch like:

foo.h:

#define FOO

bar.cpp:

#include "foo.h"
#ifndef FOO
#include "bar.h"
#endif

so it would see and process the bar.h include, before the path to foo.h
is added, which removes the bar.h dependency. More generally, you could
imagine a situation where adding an include removes another one, and
removing that other adds the first one again, leading to a loop.

However, in your code you are cleverly always processing the first
missing header file only, which should prevent both of the above
problems from occuring AFAICS. The first missing #include that is found
should always be "correct" in the sense that it can only be influenced
by #includes and #defines above it (which were already resolved) and any
#includes an #defines below it are processed later. So, kudos for this
approach, I think it elegantly solves the problem.

Perhaps a comment explaining this wouldn't hurt, since readers of the
code are likely to notice that just processing one file at a time is
inefficient - explaining why this is needed would be useful.

Finally, having two different ways of checking dependencies (one for
sketches, one for libraries) seems like a bad idea to me. Since this
approach is more powerful in terms of #ifs, it would be great if this
approach could be applied to the sketches as well. I think this should
be fixed before merging to master, though doing it after this code has
received some testing is probably what you intended.

@matthijskooijman
Copy link
Collaborator

Regarding the latter point, my previous pullrequest (#2174) contained a bunch of code cleanups to allow the sketch and libraries to use the same dependency resolution code. I think some of it was included in, or deprecated by the IDE refactor that happened a while ago, but perhaps some of it would still apply. I'll try to have a look at that code and see if I can combine some of that with this PR.

@PaulStoffregen
Copy link
Sponsor Contributor Author

Wow, Matthijs, I'm really impressed how deeply you've analyzed the code!

Yes, comments explaining some of the subtle points would be good. That weird regex was actually one of the final parts that held up completing this little project for a few hours. So much effort for an 8 character string!

You're absolutely right about the "Adding library..." print producing duplicate output. For simple cases like SD requiring SPI, it's just 1 extra line. But it could grow to quite a few extra lines printed in a complex case where a long chain of libraries each depend upon each other. Duplicate printing could be suppressed by adding a List or ArrayList to remember which libraries have already printed and avoid printing them again.

This technique might also be useful for sketch preprocessing. I certainly have thought about that. I would like to respectfully disagree, however, on the best path to take. Incremental improvement in relatively small and achievable steps is almost always be better software development strategy than expanding scope and features. I believe the best course is to first apply this to library dependency, then later consider how (and whether) to reuse it for sketch preprocessing. If we expand the scope of this work and delay merging it, we risk it suffering the same fate as your earlier work in #2174, and numerous other attempts that have been made over the years. Remember, this has been on the issue tracker since 2010, before Arduino was using GitHub.

@PaulStoffregen
Copy link
Sponsor Contributor Author

Shouldn't there be an error message / warning when a .h file was not found anywhere?

I asked myself this question several times while writing the code. I believe the answer is really another question: can we print a better error than the compiler will later report when it can't find the missing header.

@matthijskooijman
Copy link
Collaborator

Duplicate printing could be suppressed by adding a List or ArrayList to remember which libraries have already printed and avoid printing them again.

You can probably fix this by moving these prints up to outside of the loop, since any libraries added inside the loop are already printed once exactly.

I believe the best course is to first apply this to library dependency, then later consider how (and whether) to reuse it for sketch preprocessing.

I guess that's reasonable. I'm hesitant to have a release where there is inconsistency with how dependency detection works between libraries and sketches, but I guess having something that is slightly inconsistent in corner cases is better than having nothing at all :-)

I asked myself this question several times while writing the code. I believe the answer is really another question: can we print a better error than the compiler will later report when it can't find the missing header.

I'm not so sure if the compiler gives a proper error message right now, due to warnings being disabled. Even if that is not true or fixed, I think the compiler might give error message for multiple missing includes. Since we only tried to locate the first one and gave up entirely when that didn't work, I think the compiler might show error message for subsequent missing header files that could actually have been found if we only tried. I'm afraid that might confuse users. This is something to actually try, though, but I don't have time for that right now.

@PaulStoffregen
Copy link
Sponsor Contributor Author

However, it does seem this pull request and #2729 are on a collision course!

Maybe this is a good time to think about whether that approach or a "gcc -M" approach is best for Arduino in the long term?

@matthijskooijman
Copy link
Collaborator

I'm not sure if these two collide: #2729 does preprocessing to find and forward-declare functions, this PR processes to find (missing) include files. I really think the -M approach is the best way forward - it allows reusing the existing preprocessor, instead of trying to emulate one (using regexes, ctags or whatever).

Ideally, you'd also do this for function declarations, but that's probably not feasible, hence #2729 tries to fix this in a best-effort way.

AFAICS, these two approaches solve different problems and can and should just coexist.

@PaulStoffregen
Copy link
Sponsor Contributor Author

This and #2729 absolutely do conflict. It completely replaces that part of Compiler.java with another abstraction layer called "chainRings".

@PaulStoffregen
Copy link
Sponsor Contributor Author

I'm reading the code now. It seems the code moved to preproc/IncludesToIncludeFolders.java.

I have mixed feelings about this coding style. It seems every change to Arduino is adding more and more unnecessary abstraction. Really, what's the point to adding a PreprocessorChainRing class, which merely is used to call a bunch of functions? Why can't those just be ordinary classes with ordinary functions? The only "benefit" seems to be encapsulating all the results into a "context" list, which might be nice if you love abstraction layers, but it really only reduces the readability of Compiler.java by hiding which of the called functions actually accesses the 2 lists "includeFolders" and "importedLibraries". It may look pretty, but instead of ordinary code it becomes a list of actions with data flow obscured from easy view. You have to dig through many more files to figure out which of those things impacts any particular piece of the final output state used by the rest of Compiler.java. It even breaks simple searching for known variable names, by adding a layer that names them with strings. Maybe Frederico has some larger plans for this abstraction layer, but as it's used now, I can't see how it helps, and it certainly adds yet another unnecessary layer of complexity.

Ok, I'm done ranting now....

So much work has already happened on the coanctags branch that it seems impossible to imagine it's not going to be merged. It's a massively invasive change, in terms of lines modified that will break most other pending pull requests, including a lot of unnecessary white space editing.

Had I looked at this yesterday, I probably would have waited to make this library dependency improvement.

@Lauszus
Copy link
Contributor

Lauszus commented Mar 19, 2015

@PaulStoffregen I just tested this PR with the USB Host shield library which depends on the SPI library and it works fine :)

@ffissore
Copy link
Contributor

By encapsulating single responsibilities into single files they become much easier to test. Indeed every single file that implements PreprocessorChainRing has a corresponding test class which requires a minimum amount of setup code. It could have been done in a million other ways of course, but implementing the Command pattern like so make it easier to force a contribution to keep concerns separated.

As for the merging, don't worry. The preprocessor PR is lacking a fundamental feature that will make it break every sketch more complex than a Blink, so it's not going to be merged very soon. On the contrary, yours is receiving positive feedback and it's likely to be merged sooner.
In essence: I'll be the one who will deal with conflicts.

@PaulStoffregen
Copy link
Sponsor Contributor Author

On the plus side, it looks like coan is only called from preproc/IncludesFinder.java. Seems like it ought to be pretty straightforward to later switch from "coan" to "gcc -M", if there's any reason to do so.

@PaulStoffregen
Copy link
Sponsor Contributor Author

@ffissore - Ok, I get the need for unit tests. But please, I hope you'll reconsider the approach of "context". Packing all the data away into a list of named objects makes the Arduino source code dramatically more difficult to read and understand. You can't see in Compiler.java which objects are doing things with which data, like any ordinary coding style would make easily apparent. By using this sort of abstraction, you may be closing the door to future contributors by making the learning curve much too steep.

@ffissore
Copy link
Contributor

Ok, what if I use a java bean instead, with getters and setters? This will make it easier to search for "callers of getImportedLibraries"

@matthijskooijman
Copy link
Collaborator

This and #2729 absolutely do conflict. It completely replaces that part of Compiler.java with another abstraction layer called "chainRings".

Right, of course. I previously meant that these two changes can conceptually co-exist, I hadn't looked at wether the actual implmentations would clash (which they clearly do).

@PaulStoffregen
Copy link
Sponsor Contributor Author

Like you said, there's a million ways to do something.

If you're willing to do more on this, please try to imagine a moderately competent (but far from expert - a typical open source would-be contributor) Java programmer will perceive when the read Compiler.java. In the coanctags branch, first there's a list of object instances that look like function calls. None return data, or take inputs references that appear to be their outputs. Then generic "context" is created and they're all called. Afterwards, 2 items are taken out of the context list.

From an abstraction point of view, it's beautiful. From a code readability point of view, Compiler.java, one of the most important files in Arduino's source, becomes much less readable.

Ordinary, highly readable code defines the variables first. Ordinary code passes variables into functions, or creates objects, usually putting data into them via their constructor or by functions shortly after they're created. Readable code, read top to bottom, shows both the order actions AND gives a strong sense of what data is being manipulated, as it's passed from function to function. The point is you can see what's happening by simply reading ONE higher-level file's code, like Compiler.java.

I know you're juggling a lot of priorities. I understand you probably don't have extra time to rework stuff like this. But if you can, I hope you'll consider how all these many abstraction layers are creating a steeper and steeper learning curve for potential new contributors. Unreadable code that doesn't show data flow without digging through many other files and understanding numerous abstraction classes is a sure way to close to door to most future contributors.

@systronix
Copy link

What Paul said about readability and keeping foremost the priorities of functionality and maintainability. Hard to maintain code will wither and die, and/or become a real burden for the few who can understand it. I'm sure that's not the intent!

@PaulStoffregen
Copy link
Sponsor Contributor Author

I've added a couple more commits, for the comments Matthijs suggested and the duplicate printing issue.

@Coding-Badly
Copy link

I believe I have a test case that fails (assuming I understand how the new code should work). What's the best way to get you the files / instructions? Email?

Is there a way to attach binaries to one of these comments? (Nope: "Unfortunately, we don't support that file type. Try again with a PNG, GIF, or JPG.")

@Coding-Badly
Copy link

Huh. Well. Maybe not a "failure". If there is no CPP file with a library, headers from other libraries are not automatically found. But, I believe, it was established that a library with no CPP (or C) file is out of bounds.

@PaulStoffregen
Copy link
Sponsor Contributor Author

You could create a repository here on github and commit the files. Or you could email them directly to me, paul at pjrc dot com. I'll give them a try and investigate.

@Coding-Badly
Copy link

I will email files and instructions but, at this point, you looking at it seems like a waste of your time. It really is a matter of has-CPP versus does-not-have-CPP.

@PaulStoffregen
Copy link
Sponsor Contributor Author

It's supposed to work the .c and even .S files. I'll be happy to take a look when I have the files. Better to discover any problems or unusual cases now, rather than after it's in the hands of many thousands of end users!

@Coding-Badly
Copy link

Yup. It works if there is a C or a CPP file. If there are only H files it fails.

(Email sent.)

@matthijskooijman
Copy link
Collaborator

@Coding-Badly, if there are only .h files, you will be including them from your sketch. This doesn't currently pull in any recursive dependencies, but once we also apply this preprocessor-based approach to sketch dependencies, I thin that case will be covered as well (when processing the sketch, which includes the library's .h file, any other missing include paths should show up there).

@PaulStoffregen
Copy link
Sponsor Contributor Author

@Coding-Badly, I'm looking at your files now.

Even though you've got 2 libraries containing only header files, where one includes the other's headers, from a code dependency point of view this isn't actually a case where one library depends on the other.

The sketch depends on both these libraries, even though it includes the header from only one of them. The "ProcessorTest" library can't depend on anything, since it contains no files that are actually cause the Compiler.java to invoke gcc. The sketch causes the actual run of gcc which depends on header files from both libraries.

We casually talk about one library or the sketch depending upon another file or library. But sketches and libraries and even files are not the thing that really depends on other things, despite #include syntax. The runs of "gcc -c" are what actually depends on files. Speaking casually but accurately, a library depends on something else if any of the gcc instances compiling its code need any of those files.

When/if we apply this "gcc -M" approach to sketches, I'm confident it will properly handle this case. When "gcc -M" is run on the .cpp file that's made from your .ino file, it'll follow the #include "ProcessorTest.h" into all those headers and discover ProcessorTest_m328_16.h includes PaulTest.h. It will produce a list of all the headers needed, for the set of #ifdef matching your code actually does. Then the lookups on importToLibraryTable will resolve both of those headers and put the 2 libraries onto the "importedLibraries" list.

@PaulStoffregen
Copy link
Sponsor Contributor Author

As a quick test, I ran the "gcc -M" commands on the .cpp file produced from the .ino.

Here's the first iteration command, before anything is known:

/home/paul/teensy/arduino-1.6.1/hardware/tools/avr/bin/avr-g++ -E -M -MG -g -Os -w -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -mmcu=atmega328p -DF_CPU=16000000L -DARDUINO=10601 -DARDUINO_AVR_UNO -DARDUINO_ARCH_AVR -I/home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/cores/arduino -I/home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/variants/standard  /tmp/build2813178344378111191.tmp/PR-2792.cpp

   PR-2792.o: /tmp/build2813178344378111191.tmp/PR-2792.cpp ProcessorTest.h \
   /home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/cores/arduino/Arduino.h \
   /home/paul/teensy/arduino-1.6.1/hardware/tools/avr/avr/include/stdlib.h \
   /home/paul/teensy/arduino-1.6.1/hardware/tools/avr/lib/gcc/avr/4.8.1/include/stddef.h \
   (snip)

On the first line, you can see it detected "ProcessorTest.h" is needed. This is one of the cases where gcc puts more than 1 file on the same line, delimited by spaces, which is why I had to create that weird regex on line 1033 in Compiler.java.

Here's the 2nd iteration command, with ProcessorTest added to the include paths:

/home/paul/teensy/arduino-1.6.1/hardware/tools/avr/bin/avr-g++ -E -M -MG -g -Os -w -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -mmcu=atmega328p -DF_CPU=16000000L -DARDUINO=10601 -DARDUINO_AVR_UNO -DARDUINO_ARCH_AVR -I/home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/cores/arduino -I/home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/variants/standard -I/home/paul/teensy/sketch/libraries/ProcessorTest /tmp/build2813178344378111191.tmp/PR-2792.cpp

   PR-2792.o: /tmp/build2813178344378111191.tmp/PR-2792.cpp \
    /home/paul/teensy/sketch/libraries/ProcessorTest/ProcessorTest.h \
    /home/paul/teensy/sketch/libraries/ProcessorTest/ProcessorTest_m328.h \
    /home/paul/teensy/sketch/libraries/ProcessorTest/ProcessorTest_m328_16.h \
    PaulTest.h \
    /home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/cores/arduino/Arduino.h \
    /home/paul/teensy/arduino-1.6.1/hardware/tools/avr/avr/include/stdlib.h \
    /home/paul/teensy/arduino-1.6.1/hardware/tools/avr/lib/gcc/avr/4.8.1/include/stddef.h \
   (snip)

On this 2nd iteration, it properly detected that "PaulTest.h" is now required. Because there's a library with that header name, it'll be on Arduino's importToLibraryTable list.

Also notice how "gcc -M" did not list the other header files from ProcessorTest which aren't actually used when Arduino Uno is selected.

Here's the third iteration command, with PaulTest added to the include path:

/home/paul/teensy/arduino-1.6.1/hardware/tools/avr/bin/avr-g++ -E -M -MG -g -Os -w -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -mmcu=atmega328p -DF_CPU=16000000L -DARDUINO=10601 -DARDUINO_AVR_UNO -DARDUINO_ARCH_AVR -I/home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/cores/arduino -I/home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/variants/standard -I/home/paul/teensy/sketch/libraries/ProcessorTest -I/home/paul/teensy/sketch/libraries/PaulTest /tmp/build2813178344378111191.tmp/PR-2792.cpp

   PR-2792.o: /tmp/build2813178344378111191.tmp/PR-2792.cpp \
    /home/paul/teensy/sketch/libraries/ProcessorTest/ProcessorTest.h \
    /home/paul/teensy/sketch/libraries/ProcessorTest/ProcessorTest_m328.h \
    /home/paul/teensy/sketch/libraries/ProcessorTest/ProcessorTest_m328_16.h \
    /home/paul/teensy/sketch/libraries/PaulTest/PaulTest.h \
    /home/paul/teensy/arduino-1.6.1/hardware/arduino/avr/cores/arduino/Arduino.h \
    /home/paul/teensy/arduino-1.6.1/hardware/tools/avr/avr/include/stdlib.h \
    /home/paul/teensy/arduino-1.6.1/hardware/tools/avr/lib/gcc/avr/4.8.1/include/stddef.h \
   (snip)

Now "gcc -M" is able to find every file. This is our indication the searching is over and we've built a complete list of include paths and a complete list of libraries the sketch depends upon.

@PaulStoffregen
Copy link
Sponsor Contributor Author

Whether or not to apply this approach to sketches is a question for Cristian & Federico (and maybe Massimo?) Should "gcc -M" replace the work already done with "coan"? Before anyone does any work in this direction, I think we really need to hear from them.

I'd like to continue moving forward with this on sketches after this is merged for libraries. Issue #236 has been unresolved for nearly 5 years, so the last thing I want to do is expand the scope of this work and add delay for closing #236. I believe it does make sense to apply "gcc -M" to discovering the sketch dependencies, but it also makes sense to develop software as a sequence of incremental improvements with achievable milestones.

@matthijskooijman
Copy link
Collaborator

Agreed with @PaulStoffregen on all points :-)

@KurTell
Copy link

KurTell commented Mar 20, 2015

I tried to compile my testcase with PR-2792-BUILD-231-windows.zip, and it does not compile the sketch which compiles fine with Arduino 1.6.0. No output is given, neither in the DOS (cmd) window nor in the IDE's window. It's not really a helpful comment I guess...

@PaulStoffregen
Copy link
Sponsor Contributor Author

@KurTell - Any chance you could post that sketch, or email it directly to me? (paul at pjrc dot dom)

@mgcrea
Copy link

mgcrea commented Mar 29, 2015

👍 Hope this gets merged soon.

@cmaglie cmaglie added feature request A request to make an enhancement (not a bug fix) Component: IDE user interface The Arduino IDE's user interface Component: IDE The Arduino IDE Component: Compilation Related to compilation of Arduino sketches and removed Component: IDE user interface The Arduino IDE's user interface labels Apr 15, 2015
@ffissore ffissore self-assigned this May 12, 2015
@Testato
Copy link

Testato commented Aug 14, 2015

I have a test case, i write two lib, the first may use or may not use the second lib.
But also if the second lib is not used, the user need install it.
The topic is in italian, but the note of version 2.0 explain this in english
http://forum.arduino.cc/index.php/topic,96163.0.html

@ffissore
Copy link
Contributor

Fixed by arduino-builder, since version 1.0.0-beta8, available with hourly builds http://www.arduino.cc/en/Main/Software#hourly

@ffissore ffissore closed this Sep 18, 2015
@ffissore ffissore added this to the Release 1.6.6 milestone Sep 18, 2015
@PaulStoffregen
Copy link
Sponsor Contributor Author

Not fixed with arduino-builder version 1.0.0-beta9 :(

@ffissore
Copy link
Contributor

Tracking better implementation at arduino/arduino-builder#12

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Compilation Related to compilation of Arduino sketches Component: IDE The Arduino IDE feature request A request to make an enhancement (not a bug fix)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet