scope and closures: ch7, finishing 'closure lifecycle' section

getify · Feb 15, 2020 · 6d0f2a2 · 6d0f2a2
1 parent b583c62
commit 6d0f2a2
Showing 1 changed file with 125 additions and 14 deletions.
diff --git a/scope-closures/ch7.md b/scope-closures/ch7.md
@@ -431,19 +431,79 @@ Since closure is inherently tied to a function instance, its closure over a vari
 
 If 10 functions all close over the same variable, and over time 9 of these function references are discarded, the remaining function reference still preserves that variable. Once that final function reference is discarded, the last closure over that variable is gone, and the variable itself is GC'd.
 
-This has an important impact on building efficient and performant programs, because closure can unexpectedly prevent the GC of a variable that you're otherwise done with, which leads to run-away memory usage over time.
-
-One question we need to tackle: should we think of closure as applied to just the referenced variable(s) in an inner function, or is closure preserving the entire scope chain with all its variables?
+This has an important impact on building efficient and performant programs, because closure can unexpectedly prevent the GC of a variable that you're otherwise done with, which can lead to run-away memory usage over time. That's why it's important to discard function references (and thus their closures) when they're not needed anymore.
 
 Consider:
 
+```js
+function manageBtnClickEvents(btn) {
+    var clickHandlers = [];
+
+    return function listener(cb){
+        if (cb) {
+            let clickHandler =
+                function onClick(evt){
+                    console.log("clicked!");
+                    cb(evt);
+                };
+            clickHandlers.push(clickHandler);
+            btn.addEventListener(
+                "click",
+                clickHandler
+            );
+        }
+        else {
+            // passing no callback unsubscribes
+            // all click handlers
+            for (let handler of clickHandlers) {
+                btn.removeEventListener(
+                    "click",
+                    handler
+                );
+            }
+
+            clickHandlers = [];
+        }
+    };
+}
+
+// var mySubmitBtn = ..
+var onSubmit = manageBtnClickEvents(mySubmitBtn);
+
+onSubmit(function checkout(evt){
+    // handle checkout
+});
+
+onSubmit(function trackAction(evt){
+    // log action to analytics
+});
+
+// later, unsubscribe all handlers:
+onSubmit();
+```
+
+In this program, the inner `onClick(..)` function holds a closure over the passed in `cb` (the provided event callback). That means that `checkout()` and `trackAction()` function references are held via closure (and cannot be GC'd) for as long as these event handlers are subscribed.
+
+When we call `onSubmit()` with no input, all event handlers are unsubscribed, and the `clickHandlers` array is reset. Once all click handler function references are discarded, all closures and their  `cb` references to `checkout()` and `trackAction()` are discarded.
+
+The call to unsubscribe an event handler when it's no longer needed can be even more important the subscription -- again, considering the overall health and efficiency of the program.
+
+### Per Variable or Per Scope?
+
+Another question we need to tackle: should we think of closure as applied to just the referenced outer variable(s) present in an inner function, or does closure preserve the entire scope chain with all its variables?
+
+In other words, in the previous event subscription snippet, is the inner `onClick(..)` function closed over only `cb`, or is it also closed over `clickHandler`, `clickHandlers`, and `btn`?
+
+Conceptually, closure is **per variable** rather than *per scope*. But the answer is more complicated than that.
+
+Another program to consider:
+
 ```js
 function manageStudentGrades(studentRecords) {
     var grades = studentRecords.map(getGrade);
 
     return addGrade;
 
-
     // ************************
 
     function getGrade(record){
@@ -472,7 +532,7 @@ var addNextGrade = manageStudentGrades([
     { id: 73, name: "Suzy", grade: 87 },
     { id: 112, name: "Frank", grade: 75 },
     // ..many more records..
-    { id: 6, name: "Sarah", 91 }
+    { id: 6, name: "Sarah", grade: 91 }
 ]);
 
 // later
@@ -482,25 +542,76 @@ addNextGrade(68);
 // [ .., .., ... ]
 ```
 
-Let's break down what's happening in this program.
+The outer function `manageStudentGrades(..)` takes a list of student records, and returns a function reference (`addGrade(..)`), which we externally label `addNextGrade(..)`. Each time we call `addNextGrade(..)` with a grade, we get back a current list of the top 10 grades, sorted descending (see `sortAndTrimGradesList()`).
 
-The outer function `manageStudentGrades(..)` takes a list of student records, and returns a function we call `addNextGrade(..)`. Each time we call that function, we pass a grade in, and we get back a current list of the top 10 grades, sorted descending (see `sortAndTrimGradesList()`).
-
-From the end of the original `manageStudentGrades(..)` call and between the multiple `addNextGrade(..)` calls, the `grades` variable has been closed over (inside `addGrade(..)`), so that's how the running list of top grades is maintained. Remember, it's a closure over the variable `grades`, not its value (the array it holds).
+From the end of the original `manageStudentGrades(..)` call, and between the multiple `addNextGrade(..)` calls, the `grades` variable is closed over (inside `addGrade(..)`); that's how the running list of top grades is maintained. Remember, it's a closure over the variable `grades`, not its value (the array it holds).
 
 That's not the only closure involved, however. Can you spot other variables being closed over?
 
-Did you see that `addGrade(..)` references `sortAndTrimGradesList`? That means it's also closed over that variable, which happens to hold the associated function `sortAndTrimGradesList()`. That second inner function has to stay around so that `addGrade(..)` can keep calling it.
+Did you spot that `addGrade(..)` references `sortAndTrimGradesList`? That means it's also closed over that variable, which happens to hold a reference to the `sortAndTrimGradesList()` function. That second inner function has to stay around so that `addGrade(..)` can keep calling it, which also means any variables *it* closes over stick around -- in this case, nothing extra is closed over.
 
 What else is closed over?
 
-What about the `getGrade` variable (and its function)? Is it closed over? It's referenced by the outer scope of `manageStudentGrades(..)`, used by the initial `.map(getGrade)` call. But it's not referenced inside of `addGrade(..)` or `sortAndTrimGradesList()` (both of which themselves are kept alive by closure).
+Examine the `getGrade` variable (and its function); is it closed over? It's referenced in the outer scope of `manageStudentGrades(..)`, used in the initial `.map(getGrade)` call. But it's not referenced inside of `addGrade(..)` or `sortAndTrimGradesList()`.
 
-What about the (potentially) large list of student records we pass in as `studentRecords`? Is that variable being closed over? If it is, then that means that array of student records is never getting GC'd, which leads to this program continuing to consume a larger amount of memory than we might assume. But if we look closely again, none of the inner functions that are preserved via closure are referencing `studentRecords`.
+What about the (potentially) large list of student records we pass in as `studentRecords`, is that variable closed over? If it is, the array of student records is never getting GC'd, which leads to this program holding onto a larger amount of memory than we might assume. But if we look closely again, none of the inner functions that are preserved via closure are referencing `studentRecords`.
 
-Conceptually, closure is **per variable**. So we'd say that since `getGrade` and `studentRecords` are *not* referenced, then they're not closed over, and therefore they are freely available for GC right after the initial `manageStudentGrades(..)` completes.
+According to the *per variable* definition of closure, since `getGrade` and `studentRecords` are *not* referenced in the inner functions, they're not closed over, and are therefore freely available for GC right after the initial `manageStudentGrades(..)` completes.
 
-// TODO
+Indeed, try debugging the above code in a recent JS engine, like v8 in Chrome, placing a breakpoint inside the `addGrade(..)` function. You may notice that the inspector **does not** list the `studentRecords` variable. That's proof, debugging wise anyway, that the engine does not maintain `studentRecords` via closure.
+
+But how reliable is this observation as proof? Consider this (rather contrived!) program:
+
+```js
+function storeStudentInfo(id,name,grade) {
+    return function getInfo(whichValue){
+        // warning:
+        //   using `eval(..)` is a bad idea!
+        var val = eval(whichValue);
+        return val;
+    };
+}
+
+var info = storeStudentInfo(73,"Suzy",87);
+
+info("name");
+// Suzy
+
+info("grade");
+// 87
+```
+
+Notice that the inner function `getInfo(..)` is not explicitly closed over any of `id`, `name`, or `grade` variables. And yet, calls to `info(..)` seem to still be able to access the variables, albeit through use of the `eval(..)` lexical scope cheat (see Chapter 1).
+
+So all the variables were definitely preserved via closure, despite not being explicitly listed in the inner function. Does that disprove the *per variable* notion and prove *per scope* instead? Not exactly, either.
+
+Many modern JS engines do apply an *optimization* that removes any variables from a closure scope that aren't explicitly referenced. However, as we see with `eval(..)`, there are situations where such an optimization cannot be applied, and thus the closure scope continues to contain all its original variables.
+
+Even as recent as a few years ago, many JS engines did not apply this optimization to most programs, so it's possible your websites may still run in such browsers, especially on older or lower end devices. And the fact that it's an optional optimization in the first place, rather than a design guarantee from the specification, means that we shouldn't casually over-assume its applicability.
+
+In cases where a variable holds a large value (like an object or array) and that variable is present in a closure scope, if you don't want that variable and value to be preserved, it's safer (from a memory usage perspective) to manually discard the value rather than relying on the closure optimization and GC.
+
+Let's apply a *fix* to the earlier `manageStudentGrades(..)` example to ensure the potentially large array held in `studentRecords` is not caught up in a closure scope unnecessarily:
+
+```js
+function manageStudentGrades(studentRecords) {
+    var grades = studentRecords.map(getGrade);
+
+    // unset `studentRecords` to prevent unwanted
+    // memory retention in the closure
+    studentRecords = null;
+
+    return addGrade;
+
+    // ..
+}
+```
+
+We're not removing `studentRecords` from the closure scope; that we cannot control. We're simply ensuring that even if `studentRecords` remains in the closure scope, it's not holding onto memory unnecessarily with referencing the potentially large array of data.
+
+We also technically don't need the function `getGrade()` anymore after the `.map(getGrade)` call completes, so we could even possibly eek out a tiny bit more memory by freeing up that reference so its value isn't tied up either. That's likely unnecessary in this example, but broadly this is a technique to be aware of if you're profiling the memory performance of a critical part of your application and find too much being retained.
+
+The takeaway: it's important to know where closures appear in your program, and what variables are (or may be) included. We need to manage our closures so we're only holding onto what's needed and not wasting memory.
 
 ## Why Closure?