
Build system:

- License and ReadMe-Files now included in installation package

Documentation:
- Added reference to Read-Me file and licenses in main documentation file

Misc:
- Updated ReadMe to include incorporated third-party material
- Release notes updated
1 parent ddefaab commit 72cee9469dd5e7f078b00ee9599620964c435611 Florian Schoppmann committed Feb 9, 2012
Showing with 184 additions and 85 deletions.
  1. +14 −0 CMakeLists.txt
  2. +18 −3 ReadMe.txt
  3. +147 −82 ReleaseNotes.txt
  4. +5 −0 doc/mainpage.dox
CMakeLists.txt
@@ -164,6 +164,20 @@ if(NOT M4_BINARY)
message(FATAL_ERROR "Cannot find the m4 preprocessor.")
endif(NOT M4_BINARY)
+# -- Install Read-me files and license directory -------------------------------
+
+install(DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/license"
+ DESTINATION .
+ COMPONENT core
+ PATTERN ".DS_Store" EXCLUDE
+)
+install(
+ FILES
+ "${CMAKE_CURRENT_SOURCE_DIR}/ReadMe.txt"
+ "${CMAKE_CURRENT_SOURCE_DIR}/ReleaseNotes.txt"
+ DESTINATION doc
+ COMPONENT core
+)
# -- Local includes ------------------------------------------------------------
ReadMe.txt
@@ -12,9 +12,24 @@ For installation and contribution guides, please see the MADlib wiki at
https://github.com/madlib/madlib/wiki.
The latest documentation of MADlib modules can be found at http://doc.madlib.net
-or can be accessed directly from your MADlib installation directory by opening
-doc/user/html/index.html file.
+or can be accessed directly from the MADlib installation directory by opening
+doc/user/html/index.html.
+
+Changes between MADlib versions are described in the ReleaseNotes.txt file.
+
+MADlib incorporates material from the following third-party components:
+- Eigen 3.0.3 "is a C++ template library for linear algebra"
+ http://eigen.tuxfamily.org/index.php?title=Main_Page
+- Boost 1.46.1 (or newer) "provides peer-reviewed portable C++ source
+ libraries"
+ http://www.boost.org/
+- doxypy 0.4.2 "is an input filter for Doxygen"
+ http://code.foosel.org/doxypy
+- argparse 1.2.1 "provides an easy, declarative interface for creating command
+ line tools"
+ http://code.google.com/p/argparse/
+- PyYAML 3.10 "is a YAML parser and emitter for Python"
+ http://pyyaml.org/wiki/PyYAML
-Changes between MADlib versions are described in the ReleaseNotes.txt file.
License information regarding MADlib and included 3rd party libraries can be
found inside the license directory.
ReleaseNotes.txt
@@ -9,103 +9,168 @@ commit history located at https://github.com/madlib/madlib/commits/master.
Current list of bugs and issues can be found at http://jira.madlib.net.
+--------------------------------------------------------------------------------
+MADlib v0.3
+
+Release Date: 2012-Feb-9
+
+New features:
+* Installer:
+ - Single installer package targeting all supported DBMSs per OS (MADLIB-218)
+* C++ Abstraction Layer:
+ - Switched from using Armadillo to using Eigen for linear-algebra
+ operations, thereby eliminating the dependency on LAPACK/BLAS (MADLIB-275)
+ - Reimplemented as a template library for performance improvements
+ (MADLIB-295)
+* Decision Trees:
+ - Major update
+ - Now supports multiple split criteria (information gain, gini, gain ratio)
+ - Now supports tree pruning using a validation set to address overfitting
+ - Now supports additional functions for tree output
+ - Now supports continuous features in addition to categorical features
+ - Additional support for handling null values
+ - Improved scalability and performance
+* k-Means Clustering:
+ - Now handles any input that is convertible to SVEC. (MADLIB-42)
+ - Multiple distance functions (L1-norm, L2-norm, cosine similarity, Tanimoto
+ similarity) (MADLIB-43)
+ - Supports multiple seeding methods (kmeans++, random, user-specified list
+ of centroids)
+ - Replaced goodness of fit with the (simplified) Silhouette coefficient
+ (MADLIB-45)
+ - New run-time parameters (MADLIB-47)
+* Linear Regression:
+ - Major speed improvement
+* Logistic Regression:
+ - Major speed improvement
+ - Now handles any input that is convertible to BOOLEAN (dependent variable)
+ or DOUBLE PRECISION[] (independent variables). (MADLIB-283)
+ - An under-/overflow safe version to evaluate the (usual) logistic function,
+ for scoring logistic regression (MADLIB-271)
+ - A third optimizer: Incremental-gradient-descent (MADLIB-303)
+* Support:
+ - For Greenplum <= 4.2.0, added a workaround for INSERT INTO in the same way
+ as the existing CREATE TABLE AS workaround. This workaround is not needed
+ in Greenplum >= 4.2.1 any more. (MADLIB-265)
+ - Function version() returns MADlib build information (MADLIB-309)
+
+Bug fixes:
+Sparse vectors:
+ - Fixed sparse-vector type cast problems (MADLIB-282, MADLIB-305)
+ - Fixed a situation where using svec_sfv() could cause a segmentation fault
+ (MADLIB-350)
+ - Increased compatibility with internal PostgreSQL conventions (MADLIB-257)
+Logistic regression:
+ - Handle numerical instability more gracefully (MADLIB-343, MADLIB-345)
+ - Handle unexpected inputs more gracefully (MADLIB-284, MADLIB-344)
+ - Fixed "Random variate x is nan, but must be finite" issue (MADLIB-356)
+
+Known issues:
+ - Decision Trees not supported on Greenplum 4.0 (MADLIB-346, MADLIB-347)
+ - K-means: the error '"nan" does not exist' may be raised when input vectors
+ contain NaN. (MADLIB-364)
+ - Association Rules require the madlib schema to be in the search path
+ (MADLIB-353)
+ - Invalid arguments are not always guaranteed to be handled gracefully and
+ may lead to confusing error messages (MADLIB-28, 336, 359, 361, 363, 364)
+
--------------------------------------------------------------------------------
MADlib v0.2.1beta
- Release Date: 2011-Sep-14
-
- General changes:
- * numerous improvements to the C++ abstraction layer:
- - code clean-up
- - fixed issue where incorrect values were returned when used with
- debug builds of PostgreSQL/Greenplum (MADLIB-253)
- - fixed issue where returning arrays to PostgreSQL/Greenplum could lead
- to a crash (MADLIB-250)
- - allocated memory is now 16-byte aligned for improved stability and
- performance (MADLIB-236)
- * compiling with advanced warnings enabled by default now
- * all C/C++ code now free of warnings. On gcc <= 4.6, there might still be
- warnings due to "unclean" macros in DBMS header files (MADLIB-228)
- * prepared Solaris support in a later release (MADLIB-204)
- - added support for Sun Compiler in CMake build script
- - fixed all compilation errors with Sun compiler
- * added UDF to mimic "CREATE TABLE AS ...", as a workaround for a Greenplum
- issue (MADLIB-241). Included this as GP Compatibility module.
- * madpack utility:
- - dropped madpack dependency on PygreSQL (MADLIB-217)
- - improved security in madpack install-check (MADLIB-229)
- - fixed bashism in madpack (MADLIB-222)
- - fixed install-check not running on non-default schema (MADLIB-251)
-
- Modules/methods:
- * SVM (kernel_machines):
- - fixed cumulative error count in svm_cls_update() function
- - improved memory management in SVM module
- * Linear regression (regress):
- - fixed unexpected behavior for some edge cases (MADLIB-214)
- - fixed crashing with huge number of independent vars (MADLIB-250)
- * Logistic regression (regress):
- - added support for arbitrary expressions for dep./indep. variables, not
- just column names (MADLIB-255)
- * Quantile:
- - fixed quantile() function to be exact
- - added simple version for small data sets
- * Sparse Vectors:
- - added check for sorted dictionary to svec_sfv (MADLIB-187)
- * Decision Tree (decision_tree):
- - now can be run multiple times in one session (MADLIB-156)
-
- Known issues:
- * non-unified API for several SQL UDFs (MADLIB-208)
- * performance of the conjugate-gradient optimizer in logistic regression
- can be very poor (MADLIB-164)
+Release Date: 2011-Sep-14
+
+General changes:
+* numerous improvements to the C++ abstraction layer:
+ - code clean-up
+ - fixed issue where incorrect values were returned when used with
+ debug builds of PostgreSQL/Greenplum (MADLIB-253)
+ - fixed issue where returning arrays to PostgreSQL/Greenplum could lead
+ to a crash (MADLIB-250)
+ - allocated memory is now 16-byte aligned for improved stability and
+ performance (MADLIB-236)
+* compiling with advanced warnings enabled by default now
+* all C/C++ code now free of warnings. On gcc <= 4.6, there might still be
+ warnings due to "unclean" macros in DBMS header files (MADLIB-228)
+* prepared Solaris support in a later release (MADLIB-204)
+ - added support for Sun Compiler in CMake build script
+ - fixed all compilation errors with Sun compiler
+* added UDF to mimic "CREATE TABLE AS ...", as a workaround for a Greenplum
+ issue (MADLIB-241). Included this as GP Compatibility module.
+* madpack utility:
+ - dropped madpack dependency on PygreSQL (MADLIB-217)
+ - improved security in madpack install-check (MADLIB-229)
+ - fixed bashism in madpack (MADLIB-222)
+ - fixed install-check not running on non-default schema (MADLIB-251)
+
+Modules/methods:
+* SVM (kernel_machines):
+ - fixed cumulative error count in svm_cls_update() function
+ - improved memory management in SVM module
+* Linear regression (regress):
+ - fixed unexpected behavior for some edge cases (MADLIB-214)
+ - fixed crashing with huge number of independent vars (MADLIB-250)
+* Logistic regression (regress):
+ - added support for arbitrary expressions for dep./indep. variables, not
+ just column names (MADLIB-255)
+* Quantile:
+ - fixed quantile() function to be exact
+ - added simple version for small data sets
+* Sparse Vectors:
+ - added check for sorted dictionary to svec_sfv (MADLIB-187)
+* Decision Tree (decision_tree):
+ - now can be run multiple times in one session (MADLIB-156)
+
+Known issues:
+* non-unified API for several SQL UDFs (MADLIB-208)
+* performance of the conjugate-gradient optimizer in logistic regression
+ can be very poor (MADLIB-164)
--------------------------------------------------------------------------------
MADlib v0.2.0beta
- Release Date: 2011-Jul-8
+Release Date: 2011-Jul-8
- General changes:
- * new build and installation framework based on CMake
- * new C++ abstraction layer for easy and secure method development
- * new database installation utility (madpack)
-
- Modules/methods:
- * new: Association Rules (assoc_rules)
- * new: Array Operators (array_ops)
- * new: Decision Tree (decision_tree)
- * new: Conjugate Gradient (conjugate_gradient)
- * new: Parallel LDA (plda)
- * improved: all methods from previous release
-
- Known issues:
- * non-unified API for several SQL UDFs (MADLIB-208)
- * running decision tree more than once in one session fails (MADLIB-156)
- * performance of the conjugate-gradient optimizer in logistic regression
- can be very poor (MADLIB-164)
- * svec_sfv function doesn't check for sorted dictionary (MADLIB-187)
+General changes:
+* new build and installation framework based on CMake
+* new C++ abstraction layer for easy and secure method development
+* new database installation utility (madpack)
+
+Modules/methods:
+* new: Association Rules (assoc_rules)
+* new: Array Operators (array_ops)
+* new: Decision Tree (decision_tree)
+* new: Conjugate Gradient (conjugate_gradient)
+* new: Parallel LDA (plda)
+* improved: all methods from previous release
+
+Known issues:
+* non-unified API for several SQL UDFs (MADLIB-208)
+* running decision tree more than once in one session fails (MADLIB-156)
+* performance of the conjugate-gradient optimizer in logistic regression
+ can be very poor (MADLIB-164)
+* svec_sfv function doesn't check for sorted dictionary (MADLIB-187)
--------------------------------------------------------------------------------
MADlib v0.1.0alpha
- Release Date: 2011-Jan-31
+Release Date: 2011-Jan-31
- Initial release.
-
- Included modules/methods:
- * Naive-Bayes Classification (bayes)
- * k-Means Clustering (kmeans)
- * Support Vector Machines (kernel_machines)
- * Sketch-based Estimators (sketch)
- * Sketch-based Profile (data_profile)
- * Quantile (quantile)
- * Linear & Logistic Regression (regress)
- * SVD Matrix Factorisation (svdmf)
- * Sparse Vectors (svec)
+Initial release.
+
+Included modules/methods:
+* Naive-Bayes Classification (bayes)
+* k-Means Clustering (kmeans)
+* Support Vector Machines (kernel_machines)
+* Sketch-based Estimators (sketch)
+* Sketch-based Profile (data_profile)
+* Quantile (quantile)
+* Linear & Logistic Regression (regress)
+* SVD Matrix Factorisation (svdmf)
+* Sparse Vectors (svec)
--------------------------------------------------------------------------------
MADlib v0.1.0prerelease
- Release date: 2011-Jan-25
+Release date: 2011-Jan-25
- Demo release.
+Demo release.
doc/mainpage.dox
@@ -10,6 +10,11 @@ Useful links:
<li>MADlib bug reporting site: http://jira.madlib.net/ and quick guide: https://github.com/madlib/madlib/wiki/Bug-reporting</li>
</ul>
+Please refer to the <a href="../ReadMe.txt">Read-Me</a> file for information
+about incorporated third-party material. License information regarding MADlib
+and included third-party libraries can be found inside the
+<a href="../../license">license</a> directory.
+
@defgroup grp_modeling Data Modeling
@defgroup grp_suplearn Supervised Learning
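
The v0.3 release notes above mention "an under-/overflow safe version to evaluate the (usual) logistic function" (MADLIB-271). The commit does not show how MADlib implements it, so the following is only a minimal sketch of one common way to do this, assuming the usual definition 1 / (1 + exp(-x)). The name `stable_logistic` is hypothetical and is not part of MADlib.

```python
import math


def stable_logistic(x: float) -> float:
    """Evaluate 1 / (1 + exp(-x)) without ever exponentiating a large positive number.

    Hypothetical helper for illustration; not taken from the MADlib sources.
    """
    if x >= 0:
        # Here exp(-x) <= 1, so it cannot overflow; for very large x it
        # underflows harmlessly to 0.0 and the result is 1.0.
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the algebraically equivalent form
    # exp(x) / (1 + exp(x)), where exp(x) <= 1.
    z = math.exp(x)
    return z / (1.0 + z)
```

The naive form `1.0 / (1.0 + math.exp(-x))` raises `OverflowError` in Python for strongly negative `x` (for example `x = -1000.0`), which is the kind of failure the branch on the sign of `x` avoids.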
