Skip to content

Commit

Permalink
Merge pull request #56 from DatenMetzgerX/finalisierung-report
Browse files Browse the repository at this point in the history
Finalisierung Report
  • Loading branch information
Micha Reiser committed May 30, 2016
2 parents 9b0d5ee + 3dc221c commit 9c04226
Show file tree
Hide file tree
Showing 31 changed files with 257 additions and 214 deletions.
18 changes: 9 additions & 9 deletions doc/01_inroduction.tex
Original file line number Diff line number Diff line change
@@ -1,25 +1,25 @@
\section{Introduction}\label{sec:introduction}
The role of JavaScript dramatically changed over the last years. From an unpopular language used to add dynamic effects to web pages to a widely used language with a strong and growing community. It emerged from a browser only language to a general purpose language used to write web-, desktop-, mobile-, and server-applications. This shift is reflected in an increasing complexity and number of JavaScript projects. To tackle the higher complexity, a better tooling support is needed for effective development and refactoring.
The role of JavaScript drastically changed in recent years. From an unpopular language used to add dynamic effects to web pages to a widely used language with a strong and growing community. It emerged from a browser-only language to a general-purpose language used to write web-, desktop-, mobile-, and server-applications. This shift is reflected in an increasing complexity and number of JavaScript projects. To tackle the higher complexity, a better tooling support is needed for effective development and refactoring.

JavaScript does not provide any static analysis for proving the soundness of a program. Type and nullability checking is only performed at runtime. Performing refactorings or adding new functionality is therefore a risky task, as the programer has a very limited tooling for testing if changes have been applied correctly. Static type checking allows to detect common errors like accessing non-declared variables, missing arguments, arguments in incorrect order or invoking a non function type, without the need to execute the program. It therefore is a valuable tool for providing a fast statement about the soundness of a program.
JavaScript does not provide any static analysis to prove the soundness of a program. Type and nullability checking is only performed at runtime. Performing refactorings or adding new functionality is therefore a risky task, as the programmer has a very limited tooling for testing if changes have been applied correctly. Static type checking allows to detect common errors like accessing non-declared variables, missing arguments, arguments in incorrect order, or invoking a non-function type without the need to execute the program. It therefore is a valuable tool for providing a fast statement about the soundness of a program.

JavaScript is a dynamically typed language and therefore requires type inference for type checking. Type inference for JavaScript is a non-trivial task because of the very dynamic nature of JavaScript~\cite{JensenMollerThiemann2009}. JavaScript has several features that makes static program analysis difficult. An explanation of the language features adding complexity to static analysis follows.
JavaScript is a dynamically typed language and therefore requires type inference for type checking. Type inference for JavaScript is a non-trivial task because of its very dynamic nature~\cite{JensenMollerThiemann2009}. JavaScript has several features that makes static program analysis difficult. An explanation of the language features adding complexity to static analysis follows.

\begin{description}
\item[Dynamic Objects] A JavaScript object is a mapping from a string key to a value. Adding new properties or removing existing once can be performed dynamically. A property name can either be a static or computed value. The later is used for dynamic object creation or modification, similar to code using reflection in statically typed languages.
\item[Dynamic Object-Structure] A JavaScript object is a mapping from a string key to a value. Adding new properties or removing existing ones can be performed dynamically. A property name can either be a static or computed value. The latter is used for dynamic object creation or modification, similar to code using reflection in statically typed languages.

\item[Closures] Functions have access to variables from their enclosing scope. Invoking a function requires that the function is evaluated in its declaration scope. Therefore the analysis needs to be context-sensitive.
\item[Closures] Functions have access to variables from their enclosing scope. Invoking a function requires that the function is evaluated in its declaration scope. Therefore, the analysis needs to be context-sensitive.

\item[Side effects] Functions in JavaScript are not pure and therefore invoking a function can have side effects to arguments passed to the function or variables from the outer scope of the function declaration. These side effects can also affect the type of the involved variables, e.g. if the function assigns a value of a different type to an outer scope variable or adds a new property to a parameter. A precise analysis needs to reflect these side effects in the caller's context.

\item[This Binding] The object referenced by \textit{this} depends on the function kind and its usage. Arrow functions capture the \textit{this} of the enclosing context. The \textit{this} inside of a function declarations or expressions depends on the usage. The \textit{this} can explicitly be specified if the function is invoked using \textit{call} or \textit{apply}. Otherwise the binding of \textit{this} is implicitly defined. If the function is called as a method of an object, then \textit{this} is equal to the object, to which the method belongs. If the function is not a member of an object, then \textit{this} references the global object.
\item[\textit{this} Binding] The object referenced by \textit{this} depends on the function kind and its usage. Arrow-functions capture the \textit{this} of the enclosing context. The value referenced by \textit{this} inside of a function declaration or expression depends on the usage. The \textit{this} can explicitly be specified if the function is invoked using \textit{call} or \textit{apply}. Otherwise, the binding of \textit{this} is implicitly defined. If the function is called as a method of an object, then \textit{this} is bound to the object to which the method belongs. If the function is not a member of an object, then \textit{this} is bound to the global object.

\item[Host-Environment] The preliminary host-environment of JavaScript applications is still the browser. However, it is also used for standalone applications or to add scripting functionality to other applications. In this case, NodeJS, Rhino or another JavaScript engine is used as host-environment. Each environment exposes different native objects and methods at runtime, e.g. the browser exposes the DOM-API. A type checker needs to specifically model these objects and methods as their JavaScript code is not existent.
\item[Host-Environment] The preliminary host-environment of JavaScript applications is still the browser. However, it is also used for standalone applications or to add scripting functionality to other applications. In this case, NodeJS~\cite{NodeJS}, Rhino~\cite{Rhino} or another JavaScript engine is used as host-environment. Each environment exposes different native objects and methods at runtime, e.g. the browser exposes the DOM-API. A type checker needs to specifically model these objects and methods as their JavaScript code is not existent.
\end{description}

Implementing a sound type checker for JavaScript risks to be over-restrictive and only allows a very limited subset of JavaScript programs or reports a large amount of false positives. An unsound type system has the disadvantage that it does not detect all errors but allows type checking of a far more complete set of JavaScript programs and therefore has a better chance to be applied in actual practice.
Implementing a sound type checker for JavaScript risks to be over-restrictive and only allows a very limited subset of JavaScript programs or reports a large amount of false positives. An unsound type system has the disadvantage that it does not detect all errors, but allows type checking of a far more complete set of JavaScript programs and therefore has a better chance to be applied in actual practice.

This work defines a sound algorithm that is capable of inferring the types and perform type checking for a majority of the written code. The precision degrades for very dynamic code, e.g. code that uses dynamic object creation or manipulation. The algorithm is unsound for these cases as the inferred types might be imprecise. It is expected that these features are mainly used in frameworks or libraries. This work suggests to substitute type inference for these edge cases by using type annotations. The presented analysis is limited to JavaScript code written in strict mode. Features prohibited in strict mode are not supported by the analysis.
This work defines a sound algorithm that is capable of inferring the types and perform type checking for a majority of the written code. The precision diminishes for reflection-like code, e.g. code that uses dynamic object creation or manipulation. The algorithm is unsound for these cases as the inferred types might be imprecise. It is expected that these features are mainly used in frameworks or libraries. This work suggests to substitute type inference for these edge cases by using type annotations. The presented analysis is limited to JavaScript code written in strict mode. Features prohibited in strict mode are not supported by the analysis.


The first section compares this work with existing tools used to verify or analyze JavaScript programs. \Cref{sec:strict-mode} explains the benefits of restricting the analysis to strict mode and why it is believed that the code written in strict mode will be growing in the near future. The basics of the algorithm are explained in \cref{sec:algorithm}, the results from the evaluation are shown in the preceding section and is followed by the conclusion.
10 changes: 5 additions & 5 deletions doc/02_related_work.tex
Original file line number Diff line number Diff line change
@@ -1,15 +1,15 @@
\section{Related Work}\label{sec:related-work}
This work is related to linters, transpiled languages and other type checkers for JavaScript.

Linters like ESLint~\cite{jQuery2016} are used to enforce a specific coding style across a project or to find errors using common bug patterns. Linters use simple static analysis techniques for bug identification. These analysis mostly are intra-procedural. This work focuses on errors deducible by type checking, based on a sophisticated inter-procedural analysis.
Linters --- like ESLint~\cite{jQuery2016} --- are used to enforce a specific coding style across a project or to find errors using common bug patterns. Linters use simple static analysis techniques for bug identification. These analyses mostly are intra-procedural. This work focuses on errors deducible by type checking, based on a sophisticated inter-procedural analysis.


An alternative and quite popular approach for type checking programs executing in a JavaScript environment is by transpiling a source language to JavaScript. The source language either allows type inference~\cite{Ekblad2012, McKenna} or uses type annotations. Well-known examples are TypeScript~\cite{Microsoft2012} and Flow~\cite{Facebook2014}. The downside of a transpiled language is the need for an additional build step that slows down the development cycle and is a potential source for errors. A developer using a transpiled language needs to have a good understanding of the source language and JavaScript. This limits the amount of potential programmers or requires additional training. This work differs from transpilers as the focus is on type checking JavaScript and not transpiled languages.
An alternative and quite popular approach for type checking programs executing in a JavaScript environment is by transcompiling a source language to JavaScript. The source language either allows type inference~\cite{Ekblad2012, McKenna} or uses type annotations. Well-known examples are TypeScript~\cite{Microsoft2012} and Flow~\cite{Facebook2014}. The downside of a transpiled language is the need for an additional build step that slows down the development cycle and is a potential source for errors. A developer using a transpiled language needs to have a good understanding of the source language and JavaScript. This limits the amount of potential programmers or requires additional training. This work differs from transpilers as the focus is on type checking JavaScript and not transpiled languages.

TAJS~\cite{JensenMollerThiemann2009} is a sound type analyzer and type checker for JavaScript. The used algorithm is context and path-sensitive. TAJS uses abstract interpretation for type inference. The goal of TAJS is a precise and sound type checker that supports the full JavaScript language. The current version only supports ECMAScript 3 and a limited set of ECMAScript 5. Compared to TAJS, this works focuses on JavaScript programs using ECMAScript 6 and strict mode. Furthermore, the presented analysis is unsound to reduce the number of false positives.
TAJS~\cite{JensenMollerThiemann2009} is a sound type analyzer and type checker for JavaScript. The used algorithm is context- and path-sensitive. It uses abstract interpretation for type inference. The goal of TAJS is a precise and sound type checker that supports the full JavaScript language. If the precision of type inference diminishes, the severity of errors is reduced to warnings. The current version only supports ECMAScript 3 and a limited set of ECMAScript 5. Compared to TAJS, this work focuses on JavaScript programs using ECMAScript 6 and strict mode. Furthermore, this work is unsound to prevent false positives implied by imprecise type inference instead of emitting warnings for them.

Infernu~\cite{Lewis} implements type inference and type checking for JavaScript. It uses the Hindley-Milner algorithm. The used type system only models a limited set of JavaScript. The subset is defined by the properties required by the Hindley-Milner algorithm. For instance, the unmodified Hindley-Milner algorithm requires that a variable has exactly one type in a program, the principal type. Therefore, Infernu disallows assigning values of different types to the same variable, contrary to the JavaScript specification. This works differs from Infernu as it extends the Hindley-Milner algorithm to support a wider set of JavaScript programs and reduce the number of false positives.
Infernu~\cite{Lewis} implements type inference and type checking for JavaScript. It uses the Hindley-Milner algorithm. The used type system only models a limited set of JavaScript. The subset is defined by the properties required by the Hindley-Milner algorithm. For instance, the unmodified Hindley-Milner algorithm requires that a variable has exactly one type in a program, the principal type. Therefore, Infernu disallows assigning values of different types to the same variable, contrary to the JavaScript specification. This work differs from Infernu as it extends the Hindley-Milner algorithm to support a wider set of JavaScript programs and to reduce the number of false positives.

\citeauthor{Odgaard2014} describes in his master thesis a dynamic analysis for type inference~\cite{Odgaard2014}. The presented analysis uses code instrumentation to obtain the runtime values and derives the variable types from these. The inferred types are used to add JSDoc~\cite{JSDoc} type annotations. Compared to the approach presented by the work of \citeauthor{Odgaard2014}, this work uses static over dynamic analysis for type inference.

Tern~\cite{Haverbeke} is an editor-independent JavaScript analyzer with type inference. It only provides an API for editors but does not perform type checking. Editors can use the API of Tern to query type-information, provide auto-completion, and jump-to-definition functionality. Tern uses abstract values and abstract interpretation for type inference. Tern can only infer types for functions with an actual invocation. Non-invoked functions are not analyzed. This work differs from Tern as the project focuses on type checking and not on providing an API for editors.
Tern~\cite{Haverbeke} is an editor-independent JavaScript analyzer with type inference. It provides an API for editors but does not perform type checking. Editors can use the API of Tern to query type-information, provide auto-completion, and jump-to-definition functionality. Tern uses abstract values and abstract interpretation for type inference. Tern can only infer types for functions with an actual invocation. Non-invoked functions are not analyzed. This work differs from Tern, because the project focuses on type checking and not on providing an API for editors.

0 comments on commit 9c04226

Please sign in to comment.