# Background

Ruby is a popular programming language, widely used for web services and automation. It also has a large open source community. RubyGems.org is where most open source Ruby projects release their software packages. At the beginning of 2017, there were about 13908 gems with 1126308 different versions, and many software developers work on their gems all the time.

With this rich collection of Ruby gems, people can do their jobs much faster. But depending on an open source library also introduces future risks. Some software packages do simple and clear things. They tend to stabilize after a period of time and do not need much active maintenance. But most non-trivial software libraries do need active maintenance to address new problems, adjust to new environments, and add new features. It can be painful to depend on a third-party library and then find that the library is no longer maintained.

There are 1132 Ruby gems that stopped releasing new versions before 2015 after having been actively maintained for more than 10 months. The definition of "actively maintained for more than 10 months" is given in Process - Labeling the data.

There are also many Ruby gems that have been actively maintained for a long time, with many releases. From a software maintainability perspective, their code was at least "maintainable" as of some releases before the most recent one.

There could be different reasons why certain Ruby projects stopped and why others are still continuing. RubyGems.org keeps all versions of every gem, and most gem developers tend to release often. Perhaps we can find behavioral traces in the releases. Ruby libraries are distributed in the form of source code. If there is a pattern, or if we can find a model, then we can use it to evaluate Ruby source code or to learn how to increase software maintainability.

To avoid confusion, I will use the following terms when describing a software package on RubyGems:

* **Gem** is a software project hosted by RubyGems.org. It has a unique name and may have multiple versions.
* **Version** is one snapshot of the code that has a version number. We will sometimes also call it a "**release**".
* **Package** is the collective data related to one version of a gem. This includes the specification and all the code. Note that a package's file extension is ".gem".


## Process

### Collecting the raw data

The whole RubyGems database can be mirrored using an open source tool (https://github.com/rubygems/rubygems-mirror). The whole database was around 300GB as of the beginning of 2017, and it unzips to terabytes of Ruby source code. Then, using the Ruby Gem library (https://github.com/rubygems/gems), we can get every gem's specification and unpack the packages.

`lib/gem_specs.rb` does this job.
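As a rough illustration of the unpacking step, here is a minimal sketch that reads the specification of one mirrored `.gem` file and extracts its source files. It is not the content of `lib/gem_specs.rb`. It uses the `Gem::Package` class that ships with RubyGems rather than the library linked above, and the helper name `unpack_gem` and the returned hash keys are assumptions made for this example.

```ruby
require "rubygems/package"
require "fileutils"

# Hypothetical helper: read the spec of one .gem file and extract its
# files into <dest_root>/<name>-<version>.
def unpack_gem(gem_path, dest_root)
  package = Gem::Package.new(gem_path)
  spec = package.spec                        # Gem::Specification stored in the package

  target = File.join(dest_root, spec.full_name)
  FileUtils.mkdir_p(target)
  package.extract_files(target)              # unpacks the gem's data archive

  {
    name: spec.name,
    version: spec.version.to_s,
    released_on: spec.date,                  # release date recorded in the spec
    dir: target,
  }
end
```

A call such as `unpack_gem("path/to/some-gem-1.0.0.gem", "unpacked")` (both paths are placeholders) would return the name, version, release date, and extraction directory for that package.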

#### Versions simplification

To simplify and normalize the data, I keep only the last version of a gem in each month when there are multiple versions in that month.

After this simplification, I assume that a gem with versions in more than 10 months is the kind of software that needs continuous updates.
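The simplification can be sketched as a group-by on the calendar month of each release. This is only an illustration: the `Release` struct and its field names are assumptions, not the data structures used in the project.

```ruby
require "date"

# Hypothetical record for one release; the field names are assumptions.
Release = Struct.new(:number, :released_on)

# Keep only the last release in each calendar month.
def last_release_per_month(releases)
  releases
    .group_by { |r| [r.released_on.year, r.released_on.month] }
    .map { |_month, group| group.max_by(&:released_on) }
    .sort_by(&:released_on)
end

releases = [
  Release.new("1.0.0", Date.new(2016, 1, 3)),
  Release.new("1.0.1", Date.new(2016, 1, 20)),
  Release.new("1.1.0", Date.new(2016, 3, 5)),
]

monthly = last_release_per_month(releases)
puts monthly.map(&:number).inspect  # => ["1.0.1", "1.1.0"]
puts monthly.size                   # => 2 months with versions
```

After this step, the number of months with versions for a gem is simply the size of the resulting list, which is the quantity compared against the 10-month threshold.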

### Labeling the data

A gem will be labelled as "bad" if it:

* Has more than `maintained_months` months with new versions
* Has no new releases for 2 years
* Has more than `min_nloc` NLOC in its last version

A gem will be labelled as "good" if it:

* Has more than `well_maintained_months` months with new versions
* Has more than `min_nloc` NLOC in its last version

`well_maintained_months` is a much larger number than `maintained_months`. For the data labelled as "good", I remove the last 15 months because we don't know what will happen to these gems next, but 15 months ago they were "maintainable" software, as they have been maintained for at least another 15 months since.
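The labeling rules above can be sketched as a small function. Several things here are assumptions made for illustration: the `GemHistory` struct, the threshold values for `well_maintained_months` and `min_nloc` (placeholders, not the values used in the study), and the choice to test the "bad" rule first when a gem could match both rules. The removal of the last 15 months for "good" gems is not shown.

```ruby
require "date"

# Hypothetical summary of one gem after the monthly simplification.
GemHistory = Struct.new(:name, :active_months, :last_release_on, :nloc)

# Placeholder thresholds; only maintained_months = 10 and the 2-year gap
# come from the text, the other values are assumptions.
THRESHOLDS = {
  maintained_months: 10,
  well_maintained_months: 30,
  min_nloc: 1000,
  stopped_years: 2,
}.freeze

# Returns :bad, :good, or nil when the gem matches neither rule.
def label(gem, as_of:, thresholds: THRESHOLDS)
  return nil unless gem.nloc > thresholds[:min_nloc]

  # Date#<< moves a date back by the given number of months.
  cutoff = as_of << (12 * thresholds[:stopped_years])
  stopped = gem.last_release_on <= cutoff

  if stopped && gem.active_months > thresholds[:maintained_months]
    :bad
  elsif gem.active_months > thresholds[:well_maintained_months]
    :good
  end
end

as_of = Date.new(2017, 1, 31)
stalled = GemHistory.new("stalled", 14, Date.new(2014, 6, 1), 5000)
steady  = GemHistory.new("steady", 40, Date.new(2016, 12, 1), 5000)

puts label(stalled, as_of: as_of).inspect  # => :bad
puts label(steady, as_of: as_of).inspect   # => :good
```

Gems that are too small or too young fall through to `nil` and are left out of the labelled data.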



Different specifications are used to label data.

---a table--

#### Manual confirmation

Out of 387 stopped and sufficiently complex gems, 199 list their GitHub URL as the gem's homepage. Many of the others also use GitHub as their version control system but have a different web page as the homepage. Among these gems, 80 have open issues on GitHub from after the last version was released, which means there are still unmet needs for these gems. But they stopped.
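Selecting the gems whose homepage points at GitHub can be done from the specification alone. The following is a minimal sketch under the assumption that the homepage is available as a plain string; the helper name `github_homepage?` is made up for this example.

```ruby
require "uri"

# Hypothetical helper: true when the homepage URL is hosted on github.com.
def github_homepage?(homepage)
  host = URI.parse(homepage.to_s.strip).host
  !host.nil? && (host == "github.com" || host.end_with?(".github.com"))
rescue URI::InvalidURIError
  false
end

puts github_homepage?("https://github.com/ryanb/cancan")  # => true
puts github_homepage?("http://example.org/project")       # => false
puts github_homepage?(nil)                                # => false
```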

Here is an incomplete list of gems with at least ten open issues on GitHub after the last version was released.

| Index | Gem | Open Issues | Years Stopped | URL | Possible Reason for Stopping |
|---|---|---|---|---|---|
| 1 | axlsx | 30 | 3.4 | https://github.com/randym/axlsx | Still actively being developed, but having too many open issues and pull requests. |
| 2 | cancan | 30 | 3.7 | https://github.com/ryanb/cancan | Became very popular when released in 2009, but the author stopped all development activity in 2013. In an effort to keep the gem going, the community forked it and created CanCanCan (https://github.com/CanCanCommunity/cancancan). |
| 3 | chronic | 30 | 3.4 | https://github.com/mojombo/chronic | Nearly no development activity for 3 years. |
| 4 | fakeweb | 30 | 6.4 | https://github.com/chrisk/fakeweb | Just stopped. |
| 5 | fnordmetric | 30 | 3.6 | https://github.com/paulasmuth/fnordmetric | Just stopped. Having over 400 forks but only 6 pull requests. |
| 6 | i18n | 29 | 2.1 | https://github.com/svenfuchs/i18n | Wrong label. Stable popular component. Will have a new release soon. |
| 7 | jsduck | 30 | 3.3 | https://github.com/senchalabs/jsduck | Little development activity. |
| 8 | ooor | 22 | 3.7 | https://github.com/akretion/ooor | Still maintained (38 commits in 2016). But somehow no new release. |
| 9 | redcar | 20 | 5.0 | https://github.com/danlucraft/redcar | Just stopped. |
| 10 | surveyor | 30 | 3.8 | https://github.com/NUBIC/surveyor | Just stopped. |
| 11 | taps | 30 | 4.7 | https://github.com/ricardochimal/taps | Just stopped. |
| 12 | veewee | 30 | 2.3 | https://github.com/jedi4ever/veewee | Code still maintained with about 20 commits each year. With over 800 forks. |
| 13 | webrat | 22 | 6.1 | https://github.com/brynary/webrat | Just stopped. |
| 14 | youtube_it | 22 | 2.5 | https://github.com/kylejginavan/youtube_it | Just stopped. |

Most of the 14 gems above, which I checked manually, stopped in a way that would surprise an outsider who depends on the gem, with the following exceptions:

* `i18n` is mistakenly marked as *bad*. It had a new release (January 31, 2017) right after I mirrored RubyGems. It has provided internationalization support for Ruby since 2008.
* `ooor` and `veewee` still have some maintenance activity on their GitHub repositories. But they have not published any new release for more than two years, and people have to fork their code to make changes.

After manually checking the 14 gems, I found that 11 of them just stopped.

### Getting the Static Code Analysis

The basic building blocks of the processed data are generated by static code analyzers.

Before making any judgement about the usefulness of the metrics, I try to collect as many of them as possible. Their effectiveness will be checked later with the simple statistics and the machine learning. The conclusions can be found in the section On Static Code Analyzing.

Three open source tools are used for the static code analysis.

#### Lizard

#### Rubocop

#### Reek

### Simple Statistics

### Machine Learning for static-static code analysis

### Machine Learning for dynamic-static code analysis

### Evaluating the current popular Ruby gems

## Software Structure

## On Static Code Analyzing