Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement driver interface for internally supported AD libraries. #5499

Merged
merged 3 commits into from Jul 26, 2018

Conversation

jppelteret
Copy link
Member

This commit adds classes that act as an interface to the drivers of internally supported automatic
differentiation libraries (Adol-C and Sacado).

In a quest to simplify the road to full integration of AD into our library, I have stripped the core of the functions required to interact this these libraries from the (to be submitted) helper functions. As a result, this is, effectively, the last major commit that requires knowledge of the inner mechanics of these libraries. These classes supplement and leverage the existing Marking and ExtractData classes where possible. Although there's no context for the use of these drivers yet, one may refer to the "basic" Adol-C and Sacado test cases that I added previously to see how they're used.

As an added bonus, with this addition I think that I have completely (or nearly completely) abstracted and safely defaulted the AD-related functions and classes. So it should be possible to move from the current situation where the selection of a type_code that references a library that deal.II is not built against results in a compiler error, to one that presents only a run-time error of sorts :-)

@jppelteret jppelteret added the Automatic differentiation Issues and PRs related to symbolic and automatic differentiation label Nov 18, 2017
@jppelteret jppelteret changed the title Implement driver classes for internally supported AD libraries. Implement driver interface for internally supported AD libraries. Nov 18, 2017
@@ -59,6 +59,9 @@
* A summary of the files that implement the interface to the supported auto-differentiable
* numbers is as follows:
*
* - ad_drivers.h: Provides classes that act as drivers to the interface of internally supported
* automatic differentiation libraries. These are used internally as a intermediary to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as an intermediary ?

Vector<ScalarType> &values);

/**
* Computes the values of the vector field.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values -> gradient?

Vector<ScalarType> &values);

/**
* Computes the values of the vector field.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values -> gradient ?

Assert(independent_variables.size() == n_independent_variables,
ExcDimensionMismatch(independent_variables.size(),n_independent_variables));

double *f = new double();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not double values = 0.; and then give address to it for adolc? Then you don't need cleanup nor temporary variable. Although here you assume that double is castable to scalar_type. What if scalar_type is std::complex<double>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. My implementation was aligned with what's suggested in their documentation, but you're right that this can be optimised in a few places. I've done so here and where possible below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW. scalar_type is always double for this class (see typedef above). I've changed the remaining instances of double to scalar_type though to be consistent. Thanks for pointing this out!

ExcMessage("The length of the gradient vector must be equal "
"to the number of independent variables."));

scalar_type *g = new scalar_type[n_independent_variables]; // Use smart pointers or std_cxx11::array?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::vector< scalar_type> g(n_independent_variables)? Then give g.begin() or something to adolc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. std::array need compile-time size, which is not the case here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the note was an old todo. std::array definitely wouldn't work here.

ExcMessage("The (symmetric) hessian column length must be equal "
"to the number of independent variables."));

scalar_type **H = new scalar_type*[n_independent_variables]; // Use smart pointers or std_cxx11::array?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use hessian directly? alignment is exactly the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two reasons:

  1. The last parameter to ::hessian must be a double**. There will be no way to safely extract an equivalent data type from FullMatrix.
  2. scalar_type **H is not actually a full matrix, but only the upper triangle of the matrix. Adol-C takes advantage of the symmetry of the 2nd derivative operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) same as above, have you tried assigning addresses of each rows of FullMatrix? (2) that's not an issue, you just need to copy the part from upper triangle before returning

Copy link
Contributor

@davydden davydden Nov 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more precisely, give the address of the first element in each row of FullMatrix

ExcMessage("The length of the dependent function vector must "
" be equal to the number of dependent variables."));

scalar_type *f = new scalar_type[n_dependent_variables]; // Use smart pointers or std_cxx11::array?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

ExcMessage("The hessian column length must be equal "
"to the number of independent variables."));

scalar_type **J = new scalar_type*[n_dependent_variables]; // Use smart pointers or std_cxx11::array?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last parameter to ::jacobian must be a double**. I couldn't find a way to safely extract an equivalent data type from FullMatrix (i.e. everything I tried broke my tests).

@jppelteret
Copy link
Member Author

/run-tests


scalar_type **J = new scalar_type*[n_dependent_variables];
for (unsigned int i=0; i<n_dependent_variables; ++i)
J[i] = new scalar_type[n_independent_variables];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, but can you not just use here a pointer to each row in jacobian (instead of allocating memory)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i mean J[i] = &jacobian(i,0) should work. FullMatrix is row-major so this should be equivalent to what you do here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the more explicit explanation of what you were inferring was helpful.

#ifdef DEAL_II_WITH_ADOLC
::tapestats(active_tape_index, counts.data());
#else
AssertThrow(false, ExcNotImplemented());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here and below maybe introduce a custom exception with the message along the lines "This function is only available if deal.II is compiled with ADOL-C".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per your suggestion, I've added a custom error message.

Assert(independent_variables.size() == n_independent_variables,
ExcDimensionMismatch(independent_variables.size(),n_independent_variables));
Assert(gradient.size() == n_independent_variables,
ExcMessage("The length of the gradient vector must be equal "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would just do ExcDimensionMismatch as it will give you the actual numbers as opposed to just a message.

Same below in numerous places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@davydden
Copy link
Contributor

failing test seems to be related

all-headers/differentiation/ad/ad_drivers.h.debug (Failed)

Copy link
Member Author

@jppelteret jppelteret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestions, @davydden. I've added another couple of functions that I've pointed out to you...

* a different number of fields from which a linearization must be computed.
*/
static void
initialize (const unsigned int &n_independent_variables);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here about how these static member functions are supposed to work: an initialize() function must initialize something, but because the function is static it can't be an object of the surrounding type, and because the argument is const, that's not it either.

Also: no reference necessary. Same below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I could rename this to initialize_global_environment or something like that. Would this be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that would make the intent clearer!

*/
template<typename Stream>
static void
print_tape_stats(Stream &stream,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there ever be a reason to write to anything other than a std::ostream?

Also, @drwells has tried to work on naming conventions for template arguments, and they typically end in Type -- so maybe StreamType?

is_adolc_number<ADNumberType>::value &&
is_tapeless_ad_number<ADNumberType>::value
>::type
configure_tapeless_mode (const unsigned int n_directional_derivatives)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New function

Copy link
Contributor

@davydden davydden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's wait for someone else to review

Assert(counts.size() >= 18, ExcInternalError());
stream
<< "Tape index: " << tape_index << "\n"
<< "Number of independent variables: " << counts[0] << "\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::endl ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::endl flushes the write buffer, so its a bit wasteful to do this 18 times in succession. I do a final std::flush at the end.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, i see.


}

#ifdef DEAL_II_WITH_TRILINOS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEAL_II_WITH_ADOLC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is enabled if is_sacado_rad_number (so definitely Trilinos)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, we don't have a separate DEAL_II_WITH_SACADO, meaning that Sacado is required dependency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, its been this way for a long time. Sometime in the future I'd like to modularise our Trilinos configuration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to #6906, I can now add this finer-grained check :-)

@jppelteret
Copy link
Member Author

Thanks for the feedback @davydden!

@davydden
Copy link
Contributor

@jppelteret tester is still not happy

===============================   OUTPUT BEGIN  ===============================
all-headers/differentiation/ad/ad_drivers.h.debug: BUILD failed. Output:
Generating differentiation-ad-ad_drivers.h.debug_interrupt_guard.cc
Scanning dependencies of target differentiation-ad-ad_drivers.h.debug
Building CXX object CMakeFiles/differentiation-ad-ad_drivers.h.debug.dir/test_header.cc.o
In file included from /home/bob/source/tests/all-headers/test_header.cc:17:
/home/bob/source/include/deal.II/differentiation/ad/ad_drivers.h:495:9: error: use of undeclared identifier 'trace_on'
        trace_on(tape_index,keep);
        ^
/home/bob/source/include/deal.II/differentiation/ad/ad_drivers.h:523:42: error: unknown type name 'STAT_SIZE'
        std::vector<std::size_t> counts (STAT_SIZE);
                                         ^
/home/bob/source/include/deal.II/differentiation/ad/ad_drivers.h:523:41: warning: parentheses were disambiguated as a function declaration [-Wvexing-parse]
        std::vector<std::size_t> counts (STAT_SIZE);
                                        ^~~~~~~~~~~
/home/bob/source/include/deal.II/differentiation/ad/ad_drivers.h:523:42: note: add a pair of parentheses to declare a variable
        std::vector<std::size_t> counts (STAT_SIZE);
                                         ^
                                         (         )
/home/bob/source/include/deal.II/differentiation/ad/ad_drivers.h:524:11: error: no member named 'tapestats' in the global namespace
        ::tapestats(tape_index, counts.data());
        ~~^
/home/bob/source/include/deal.II/differentiation/ad/ad_drivers.h:915:9: error: no template named 'is_adolc_number'; did you mean 'is_ad_number'?
      !(is_adolc_number<ADNumberType>::value &&
        ^~~~~~~~~~~~~~~
        is_ad_number
/home/bob/source/include/deal.II/differentiation/ad/ad_number_traits.h:442:12: note: 'is_ad_number' declared here
    struct is_ad_number
           ^
1 warning and 4 errors generated.
make[3]: *** [CMakeFiles/differentiation-ad-ad_drivers.h.debug.dir/test_header.cc.o] Error 1
make[2]: *** [CMakeFiles/differentiation-ad-ad_drivers.h.debug.dir/all] Error 2
make[1]: *** [CMakeFiles/differentiation-ad-ad_drivers.h.debug.build.dir/rule] Error 2
make: *** [differentiation-ad-ad_drivers.h.debug.build] Error 2


all-headers/differentiation/ad/ad_drivers.h.debug: ******    BUILD failed    *******

===============================    OUTPUT END   ===============================

The following tests FAILED:
	245 - all-headers/differentiation/ad/ad_drivers.h.debug (Failed)
Errors while running CTest
test FAILED

@jppelteret
Copy link
Member Author

Erg. Ok, I understand whats going on here. I'll try to fix this during the week.

@jppelteret jppelteret added the WIP label Nov 21, 2017
Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a few comments. I think that will also inform what's happening further down below. Let me know when you've worked through these comments and want me to take another look.

DeclException2 (ExcSupportedDerivativeLevels,
std::size_t, std::size_t,
<< "The number of derivative levels that this auto-differentiable number supports is "
<< arg1 << ", but it is required that it supports at least " << arg2 << " levels.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this misunderstandable because of the passive voice in the second half of the sentence. What is "it" in "it is required"? Who requires it? A type? A code location?

Also, is it the a-d number or type that doesn't satisfy the requirement?

*/
template<typename Stream>
static void
print_tape_stats(Stream &stream,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there ever be a reason to write to anything other than a std::ostream?

Also, @drwells has tried to work on naming conventions for template arguments, and they typically end in Type -- so maybe StreamType?

template<typename Stream>
static void
print_tape_stats(Stream &stream,
const unsigned int &tape_index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for the reference

*/
static void
enable_taping(const unsigned int &tape_index,
const bool &keep_independent_values);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for the references for these arguments. Same for the next few functions.

* @return The scalar values of the function.
*/
static ScalarType
value (const unsigned int &active_tape_index,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for reference here either. check places below as well.

Vector<ScalarType> &gradient);

/**
* Computes the hessian of the scalar field with respect to all independent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> Hessian

* function is recorded.
* @param[in] independent_variables The scalar values of the independent
* variables whose sensitivities were tracked.
* @param[out] gradient The scalar values of the dependent function's gradients.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the gradient variable have to have the correct size already, or is it resized?

Same below for the Hessian matrix.

* @author Jean-Paul Pelteret, 2017
*/
template<typename ADNumberType, typename ScalarType, typename T = void>
struct TapedDrivers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may just not have read far enough down to see the specializations. But all of the member functions of this class are static, and none actually take any non-trivial arguments. What are they supposed to work on?

I presume that you are in essence defining an interface here, so in specializations of this class, the argument lists are going to consist of elementary data types as well (as opposed to including arguments of type ADNumberType). Are functions in specializations not static then, or how does this work? (And if there is an answer, could it be put into this class's documentation?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully understand what you find wrong with this, but for now how I can best motivate this interface is to reiterate that this the minimal ("sane") interface to a set of background operations that I have found to consistently support both ADOL-C and Sacado. The user should never end up calling any of these functions directly, and they will be used in the appropriate manner by the high-level helper classes.

For Sacado types, we set up, manipulate and extract data directly from each number. The one exception is here - a call to a global function.

For ADOL-C, this is not the case. They have some global variables lying around that are directly manipulated by some global functions. See, for example, here (taped mode) and and here (tapeless mode). In my mind there are no "elementary" objects to manipulate. They're managed by ADOL-C itself. (You will see later that I have put in place many things to manage their management structures :-)

So in summary, I need some specialisations for some functions based on an AD number type. In some cases these functions don't actually operate on the numbers themselves (rather, they configure the environment) so it would be disingenuous to pass in an AD number as a parameter when its not actually being manipulated. Specialising functions through the templated struct + static function idiom seemed the best compromise for all of the use cases for the "driver" functions.

The hope is that being able to write code like

template<int dim, enum AD::NumberTypes ADNumberTypeCode, typename ScalarType>
    Vector<...>
    ADHelperScalarFunction<...>::compute_gradient()
{
 if (ADNumberTraits<ad_type>::is_taped == true)
        {
          Assert(this->active_tape()!=this->invalid_tape_index,
                 ExcMessage("Invalid tape index"));
          Assert(this->is_recording() == false,
                 ExcMessage("Cannot compute gradient while tape is being recorded."));
          Assert(this->independent_variable_values.size() == this->n_independent_variables(),
                 ExcDimensionMismatch(this->independent_variable_values.size(),this->n_independent_variables()));

          TapedDrivers<ad_type,scalar_type>::gradient(
            this->active_tape(),
            this->independent_variable_values,
            gradient);
        }
      else
        {
          Assert(ADNumberTraits<ad_type>::is_tapeless == true, ExcInternalError());
          Assert(this->independent_variables.size() == this->n_independent_variables(), ExcInternalError());

          TapelessDrivers<ad_type,scalar_type>::gradient(
            this->independent_variables,
            this->dependent_variables,
            gradient);
        }
}

in the helper classes makes obfuscations like

    template<int dim, enum AD::NumberTypes ADNumberTypeCode, typename ScalarType>
    void
    ADHelperBase<...>::ADHelperBase(
      const unsigned int n_independent_variables,
      const unsigned int n_dependent_variables)
      : ...
    {
      // Tapeless mode must be configured before any active live
      // variables are created.
      if (ADNumberTraits<ad_type>::is_tapeless == true)
        {
          TapelessDrivers<ad_type,scalar_type>::initialize(n_independent_variables);
        }
...
}

tolerable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bangerth Do you have any suggestions or recommendations here? I'll rebase and address all of the other comments in the mean time, but it would be great to some idea as to whether I need to address this point and that below...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand now what you are trying to do, though I'll admit that it is significantly above my head :-)

The example you gave made sense to me: In essence, you're trying to overload functions based on type, but because some of them manipulate a global environment (not an object passed in), you can't do it via function overloading -- so you do it through specialization of classes. That's a valid approach.

* a different number of fields from which a linearization must be computed.
*/
static void
initialize (const unsigned int &n_independent_variables);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here about how these static member functions are supposed to work: an initialize() function must initialize something, but because the function is static it can't be an object of the surrounding type, and because the argument is const, that's not it either.

Also: no reference necessary. Same below.

@bangerth
Copy link
Member

bangerth commented Jul 5, 2018

@jppelteret -- ping?

@jppelteret
Copy link
Member Author

@bangerth Thanks for the reminder. Let me try to re-engage with you on this PR!

@jppelteret
Copy link
Member Author

@bangerth @davydden I think that I've attended to all of the recommendations that you've made. Unfortunately ADOL-C decided to make some breaking changes between the time that this PR was initially submitted and now, so I've included some corrections for those (see the the build system changes and here). I also added some new types here which will be useful later on.

Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More later.

@@ -408,6 +408,9 @@
* A summary of the files that implement the interface to the supported auto-differentiable
* numbers is as follows:
*
* - ad_drivers.h: Provides classes that act as drivers to the interface of internally supported
* automatic differentiation libraries. These are used internally as an intermediary to the
* the helper classes that we provide.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the the

@@ -0,0 +1,4 @@
New: Classes that act as an interface to the drivers of internally supported automatic
differentiation libraries (Adol-C and Sacado) have been implemented.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth referencing the AD doxygen module here? We run the changelog files through doxygen, so all of the usual markup commands and references work.

* Typedef for tape indices. ADOL-C uses short integers, so
* we restrict outselves to similar types.
*/
using tape_index_type = unsigned short;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not realize this earlier. This is a pretty generic name used for a pretty specific thing. Would it be possible to have these in a nested namespace types::AD, for example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or better, AD::types maybe?

Also, since it's in a namespace types, the suffix _type is not necessary for any of these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll put this into a Differentiation::AD::types namespace.

/**
* Typedef for tape buffer sizes.
*/
using tape_buffer_type = unsigned int;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe reflect sizes in the type name as well?

#ifdef DEAL_II_WITH_ADOLC
// Note: This value is a limitation of ADOL-C, and not something that we
// have control over. See test adolc/helper_tape_index_01.cc for verification
// that we cannot use of exceed this value. This value is defined as TBUFNUM;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of -> or

static void
jacobian(const types::tape_index_type active_tape_index,
const unsigned int & n_dependent_variables,
const std::vector<ScalarType> &independent_variables,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

I'd almost prefer it if we called this function gradient as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or gradients, if you want.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to give two arguments as to why I'd like to keep this naming. First, there seems to be a genuine mathematical distinction between these two concepts. See here and here for example. Secondly, it mimics the function calls used for ADOL-C directly (they distinguish between gradient and Jacobian with the same logic as mentioned in the first point).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Maybe drop the name "gradient" in the documentation of this function, and mention the jacobian() function in the documentation of the gradient() function as a cross-reference?

*
* @tparam ADNumberType A type corresponding to a supported
* auto-differentiable number.
* @tparam ScalarType A real or complex floating point number.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that many of the same comments from above apply here as well

* @warning For ADOL-C tapeless numbers, the value given to @p n_independent_variables
* should be the <b>maximum</b> number of independent variables
* that will be used in the entire duration of the simulation. This is
* important in the context of, for example, hp-FEM and for multiple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

funny indentation

*
* @warning For ADOL-C tapeless numbers, the value given to @p n_independent_variables
* should be the <b>maximum</b> number of independent variables
* that will be used in the entire duration of the simulation. This is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in -> for the entire

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same in line 356

/**
* In the event that the tapeless mode requires <i>a priori</i> knowledge
* of how many directional derivatives might need to be computed, this
* function informs the auto-differentiable library of what this number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> differentiation

@jppelteret
Copy link
Member Author

@bangerth Thanks for the second round of feedback. I'll implement your suggestions tomorrow morning :-)

@bangerth
Copy link
Member

As discussed, let's merge this. Can you please squash?

@jppelteret
Copy link
Member Author

@bangerth Is it ok if I just squash the last 4 commits? Or do you want the entire PR squashed into 2 commits (8be6563 + the rest)?

@bangerth
Copy link
Member

Whatever you think is appropriate. I'm just not fond of commits with commit messages such as "Address WB comments" -- they don't stand well on their own :-)

@jppelteret
Copy link
Member Author

@bangerth Done.

@bangerth
Copy link
Member

OK. Ready to merge then.

Let's wait for the indentation checker at least. Do you recall whether all tests passed before your squash?

@jppelteret
Copy link
Member Author

I pushed an update this morning to fix a test that was still broken yesterday, so I guess that I can't say with absolute certainty that this passes the tests. I'll be patient and let the tester run on it once more.

/run-tests

@bangerth
Copy link
Member

We're reasonably confident that this passed all tests, so I merged to unblock all of the follow-ups. If anything should go wrong with the tests after all, then @jppelteret will fix it on short notice.

Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my rest of the comments. Please address them in a separate patch if you can!

{
trace_off(active_tape_index); // Slow
std::vector<std::size_t> counts(STAT_SIZE);
::tapestats(active_tape_index, counts.data());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you leave a comment about what this does? it looks like you're just passing an array of zeros (of size STAT_SIZE) here, but what does the function actually do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, rather, what do you do with the counts array?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this should be removed (legacy of the non-descriptive tutorials I based this code off of). I'll check to see if I can legitimately remove these lines.

::function(active_tape_index,
1, // Only one dependent variable
independent_variables.size(),
const_cast<double *>(independent_variables.data()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

groan...

can't these guys be bothered to put a const in their function declarations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, its frustrating >:-(

}
};

# else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you copy the condition here as a comment so a reader doesn't have to go back several hundred lines?

// to provide a more descriptive error message if any of its
// static member functions are called.
template <typename ADNumberType>
struct TapedDrivers<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can't call any of these functions. couldn't you have the same effect by just putting a static_assert(false, "...") into the class itself? Or is there ever a reason to have code that compiles with the non-specialized version but that is not supposed to run?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is to help prevent the user (and us) from having to put ifdef DEAL_II_WITH_ADOLC all over their code. The helper functions will try to use this specialisation of the class if the user has selected the type code type_code == NumberTypes::adolc_taped, and the code will always successfully compile but should never run.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but should never run

... if ADOL-C is not installed.

}
};

# endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy same condition here

// For reverse-mode Sacado numbers it is necessary to broadcast to
// all independent variables that it is time to compute gradients.
// For one dependent variable one would just need to all
// ADNumberType::Gradcomp(), but since we have a more
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grammar

// This function checks to see if its legal to increase the maximum
// number of directional derivatives to be considered during calculations.
// If not then it throws an error.
template <typename ADNumberType>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/** ... */

else
{
// So there are some live active variables floating around. Here we
// check if we ask to increase the number of number of computable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

number of number of

{
// So there are some live active variables floating around. Here we
// check if we ask to increase the number of number of computable
// directional derivatives. If this really is necessary then its
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's

// forward mode to compute the first (and, if supported, second)
// derivatives.
template <typename ADNumberType, typename ScalarType>
struct TapelessDrivers<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/** ... */

@bangerth
Copy link
Member

bangerth commented Aug 1, 2018

Thanks for making these last changes, @jppelteret ! If you put them into a separate PR, this should be mergeable quickly!

@jppelteret jppelteret deleted the ad-drivers branch August 1, 2018 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Automatic differentiation Issues and PRs related to symbolic and automatic differentiation Reviewed and ready to merge
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants