Commit

Refactor section 1.3.
See what you think @jannes-m - I think this reads better now, is less colloquial while being simultaneously more concise.

However the content you added was excellent - I've aimed to clarify the message.
Robinlovelace committed Oct 21, 2017
1 parent 801f98e commit 8a53516
Showing 1 changed file with 23 additions and 19 deletions.
42 changes: 23 additions & 19 deletions 01-introduction.Rmd
@@ -137,25 +137,29 @@ For example, the open-source desktop GIS [gvSig](http://www.gvsig.com/en/product
There are also many open source add-on libraries available for Java, including [GeoTools](http://docs.geotools.org/) and the [Java Topology Suite](https://www.locationtech.org/projects/technology.jts).^[Please note, that GEOS is a C++ port of the Java Topology Suite.]
Furthermore, many server-based applications use Java including among others [Geoserver/Geonode](http://geonode.org/), [deegree](http://www.deegree.org/) and [52°North WPS](http://52north.org/communities/geoprocessing/wps/).

Java's object-oriented syntax is similar to C++, however, its memory management is much simpler.
Still, it is rather unforgiving regarding class, object and variable declarations forcing you to think about a well-designed programming structure.
This is especially useful in large projects with thousands of lines of code placed in numerous files.
Following the *write once, run anywhere* principle, Java is platform-independent (which is unusual for a compiled programming language).
Overall, compiled Java programs have an excellent performance on large-scale systems making them suitable candidates for complex architecture projects such as programming a desktop GIS.
However, Java is probably less suitable for statistical modeling and visualization compared to Python or R.
First and foremost, though you can do data science with Java [@brzustowicz_data_2017], Java offers much fewer statistical libraries especially when compared with R.
Secondly, interpreted languages (such as R and Python) are often easier to write (at the price of lower performance) than compiled languages (such as Java and C++).
Moreover, interpreted languages allow a faster and interactive (line-by-line) code implementation.
Finally, R's native support of data types such as data frames and matrices is especially advantageous when it comes to analyzing data.

Lastly, we will introduce Python for geocomputation
Many people believe that R and Python are battling for supremacy in the field of data science.
This is accompanied by a partly offensive as much as often rather subjective discussion on what to learn or what to use.
We believe that both languages have their merits, and in the end it is about doing geocomputation and communicating the corresponding results regardless of the chosen software.
Moreover, both languages are object-oriented, and have lots of further things in common.
Learning one language should give you a headstart when choosing to learn the other as well.
R's major advantage is that statisticians wrote hundreds of statistical packages (unmatched by Python) explicitly for other statisticians.
By contrast, Python's major advantage is that it is (unlike R) a multi-purpose language thereby bringing together people from diverse fields which also explains Python's bigger user base compared to R's.
Java's object-oriented syntax is similar to that of C++ but its memory management is simpler.
Java is rather unforgiving regarding class, object and variable declarations, which encourages well-designed programming structure, useful in large projects with thousands of lines of code placed in numerous files.
Following the *write once, run anywhere* principle, Java is platform-independent (which is unusual for a compiled language) and has excellent performance on large-scale systems.
This makes Java a suitable language for complex architecture projects such as desktop GIS.

Java is less suitable for statistical modeling and visualization compared to Python or R.
Although Java can be used for data science [@brzustowicz_data_2017], it has relatively few statistical libraries, especially compared with R.
Furthermore, Java is hard to use interactively.
Interpreted languages (such as R and Python) are better suited than compiled languages (such as Java and C++) to the interactive style of work common in geographic analysis.
Unlike Java (and most other languages) R has native support for data frames and matrices, making it especially well suited for (geographic) data analysis.

Python is the final language for geocomputation that deserves attention in this section.
Like R, Python has gained popularity due to the rapid growth of data science [@robinson_impressive_2017].
Both languages are object-oriented, and have lots of further things in common.
Due to their similarities, there is much online discussion framing the relative merits of each language as a competition, as exemplified by an [infographic](https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis) by DataCamp titled "DATA SCIENCE WARS: R vs Python", which arguably generates more heat than light.

In practice, both languages have their strengths, and to some extent which one you use is less important than the domain of application and the communication of results.
Learning either will provide a head-start in learning the other.
However, there are major advantages of R over Python for geocomputation, which explain its prominence in this book.
R has unparalleled support for statistics, including spatial statistics, with hundreds of packages (unmatched by Python) supporting thousands of statistical methods.

The major advantage of Python is that it is a multi-purpose language.
It brings together people from diverse fields, explaining its larger user base compared with R's.
So if you like Python better or you think it better suits your needs (for example because you are also interested in web and GUI development), go for it.
In fact, we often advise our students to start with Python just because the major GIS software packages provide Python libraries that let the user access their geoalgorithms from the Python command line^[`grass.script` for GRASS (https://grasswiki.osgeo.org/wiki/GRASS_and_Python), `saga-python` for SAGA-GIS (http://saga-python.readthedocs.io/en/latest/), `processing` for QGIS and `arcpy` for ArcGIS.
].

1 comment on commit 8a53516

@jannes-m
Collaborator

Great! Thanks!