Mining GitHub projects to learn about open source software development communities and practices. To view a demo of this project please see http://sqrlab.science.uoit.ca/GitHubMining/
- Xubuntu or some similar Ubuntu variant
- Ruby version 1.9.3
- Apache2
- MySQL version 14.14, distribution 5.5.38
- PHP version 5.5.9
- Required gems (see the Required Gems file)
- Java version 1.7.0_51
- Ant version 1.9.3
- Eclipse Luna
- Eclipse ADT plugin. Note: Eclipse ADT with SDK has issues installing the import plug-in.
- Eclipse metrics plugin, version 1.3.8
- Eclipse Metrics XML Reader
- Eclipse Import tool
- Maven version 2.2.1, used to create eclipse project files
- Python version 2.7.6, required by Eclipse Metrics XML reader
- xvfb-run used for headless execution of metrics collection.
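To see quickly which of the command-line tools above are already on the PATH, a small shell check can help. This is only a sketch: the helper name `missing_tools` is hypothetical, the list of commands you pass is up to you, and it checks presence only, not versions.

```shell
# missing_tools (hypothetical helper): print each named command that is
# not found on the PATH, one per line. Prints nothing if all are present.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

For example, `missing_tools ruby mysql php java ant mvn python xvfb-run` lists whichever of those commands still need to be installed.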
Please see the project setup notes for a more detailed explanation of how to set up the project.
-
Install ruby1.9.3
sudo apt-get install ruby1.9.3
-
Install the required Gems. An example would be:
gem install mysql
Ruby 2.0 is also needed, at least for scraping, since Ruby 1.9.3 keeps giving segfaults.
-
Install ruby dev for ruby2.0
sudo apt-get install ruby2.0-dev
-
Install the gems
gem2.0 install mysql
gem2.0 install json
gem2.0 install github_api
MySQL is required to store the data.
-
Install MySQL and enter the root user's password
sudo apt-get install mysql-server-5.5 mysql-common mysql-client-5.5
-
Log into the mysql server.
mysql -u <username> -p
-
Create github_data database using the create file.
source ./doc/database.sql
-
Create project_stats database using the create file
source ./doc/stats_database.sql
-
Note: the following is currently in development. Create the metrics database using the create file
source ./doc/metrics_db.sql
-
Exit the MySQL server
exit
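The same schema files can also be loaded without an interactive session by redirecting each one into the mysql client. The sketch below assumes the SQL files create and select their own databases (as sourcing them from the mysql prompt implies) and that they live under doc/ in the project root. The helper name `load_schemas` is hypothetical.

```shell
# load_schemas (hypothetical helper): feed each schema file named in this
# README to the mysql client in turn, stopping at the first failure.
# $1 = MySQL user name, $2 = project root (defaults to the current directory).
load_schemas() {
  local user="$1" root="${2:-.}" file
  # metrics_db.sql is still in development; drop it from the list if unwanted.
  for file in database.sql stats_database.sql metrics_db.sql; do
    # -p makes mysql prompt for the password on every invocation.
    mysql -u "$user" -p < "$root/doc/$file" || return 1
  done
}
```

Calling `load_schemas <username>` from the project root prompts for the password once per file.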
-
Install Apache2
sudo apt-get install apache2
-
Restart Apache2
sudo /etc/init.d/apache2 restart
-
Install PHP5
sudo apt-get install php5 libapache2-mod-php5 php5-mysql
-
Clone project into
/var/www/html/
or set up a virtual site.
-
Change the API root URL. Open the JavaScript graphing file and change the following line to point to the api folder within the project.
var rootURL = "http://git_data.dom/api";
-
Go to page
http://localhost/GitView/index.php
-
Please note, depending on whether the project is placed in the
/var/www/html/
or is a virtual site, the relative paths to the resources may need to change. The paths are currently set up for a virtual server. A setup that places the project directly into /var/www/html/
requires adjusting:
- header.php, change the href for the css resources.
- footer.php, change the href for the js resources.
Usually this just requires changing a path like:
<link href="../css/smoothness/jquery-ui-1.10.3.custom.min.css" rel="stylesheet"/>
To:
<link href="./css/smoothness/jquery-ui-1.10.3.custom.min.css" rel="stylesheet"/>
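The path change described above can be applied with sed instead of editing by hand. This is a sketch under assumptions: the helper name `use_local_paths` is hypothetical, it rewrites every `href="../` in the files it is given, and it also rewrites `src="../` on the assumption that the js resources are referenced through src attributes. A .bak copy of each file is kept so the change can be reverted.

```shell
# use_local_paths (hypothetical helper): rewrite href="../ to href="./
# (and, as an assumption, src="../ to src="./) in each file given,
# leaving the original next to it with a .bak suffix.
use_local_paths() {
  local file
  for file in "$@"; do
    sed -i.bak \
      -e 's|href="\.\./|href="./|g' \
      -e 's|src="\.\./|src="./|g' \
      "$file"
  done
}
```

Run it on header.php and footer.php wherever they sit in your checkout, then compare each file with its .bak copy before deleting the backups.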
Note: setting up a virtual site is not required if the project is cloned to /var/www/html/
-
Open
/etc/apache2/sites-enabled/000-default.conf
and add the following to the end of the file.
<VirtualHost *:80>
    ServerAdmin test@git_data.dom
    DocumentRoot "<project_location>"
    ServerName git_data.dom
    ServerAlias git_data.dom
    ErrorLog "/var/log/apache2/git_data.dom-error_log"
    CustomLog "/var/log/apache2/git_data.dom-access_log" common
    <Directory "<project_location>">
        DirectoryIndex index.php
        AddHandler php5-script php
        Options -Indexes +FollowSymLinks +MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
        Require all granted
    </Directory>
</VirtualHost>
-
Modify the
<project_location>
field in DocumentRoot
and Directory
to the location of the project.
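To avoid editing both <project_location> fields by hand, the block can be generated with the location already filled in. A minimal sketch: the helper name `print_vhost` is hypothetical, and the directives are copied from the virtual host definition above without changes.

```shell
# print_vhost (hypothetical helper): print the VirtualHost block from this
# README with <project_location> replaced by the directory given as $1.
print_vhost() {
  local project_location="$1"
  cat <<EOF
<VirtualHost *:80>
    ServerAdmin test@git_data.dom
    DocumentRoot "$project_location"
    ServerName git_data.dom
    ServerAlias git_data.dom
    ErrorLog "/var/log/apache2/git_data.dom-error_log"
    CustomLog "/var/log/apache2/git_data.dom-access_log" common
    <Directory "$project_location">
        DirectoryIndex index.php
        AddHandler php5-script php
        Options -Indexes +FollowSymLinks +MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
        Require all granted
    </Directory>
</VirtualHost>
EOF
}
```

The output can be reviewed first and then appended, for instance with `print_vhost /path/to/project | sudo tee -a /etc/apache2/sites-enabled/000-default.conf`. Running `sudo apache2ctl configtest` before restarting Apache catches syntax errors.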
-
By default, Apache shows directory listings for the website's folders. To disable this, open
/etc/apache2/sites-enabled/000-default.conf
-
Add in the following (if you followed the steps for creating the virtual site only modify the Options line):
<Directory /var/www/html/GitView/>
    Options -Indexes +FollowSymLinks +MultiViews
    AllowOverride all
    Order allow,deny
    allow from all
</Directory>
-
Now that directory listings are disabled, the next step is to block the directories that the web server does not need. The following folders require r-x permission for the web server to work:
- api
- css
- img
- inc
- js
- src
- templates
-
All other folders can be removed or have their permissions revoked for both group and other users.
sudo chmod go-rx <folder name>
-
Finally, the two files required in the root directory are:
- add_new.php
- index.php
-
All other files can be deleted or the permissions can be revoked for both group and other users.
sudo chmod go-rx <file name>
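Revoking permissions entry by entry is tedious, so the two keep lists above can be folded into one loop. This is a sketch, not a tested deployment script: the helper name `lock_down` is hypothetical, it only looks at top-level entries of the project root, and hidden entries such as .git are not matched by the glob, so they need to be handled separately.

```shell
# lock_down (hypothetical helper): in the project root given as $1, revoke
# group/other read and execute on every top-level entry that is not in the
# keep lists from this README. Hidden entries are skipped by the glob.
lock_down() {
  local root="$1" entry name
  for entry in "$root"/*; do
    [ -e "$entry" ] || continue   # glob matched nothing
    name="$(basename "$entry")"
    case "$name" in
      api|css|img|inc|js|src|templates|add_new.php|index.php)
        ;;                        # required by the web server, leave as-is
      *)
        chmod go-rx "$entry"
        ;;
    esac
  done
}
```

Call it as `lock_down /var/www/html/GitView` (or your virtual site location) as a user allowed to change those permissions.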
This section outlines how to collect and then parse the data to show on the website tool.
Please note that the script executed in this section may take a very long time (depending on the size of the project).
-
Run the scraper script on the desired project passing the repository owner and the repository's name as arguments. For example:
bash scraper ACRA acra
Please note this section relies on the completion of the previous section for the same repository. To parse ACRA/acra, the scraper script must first have been run on it.
Please note that the script executed in this section may take a very long time (depending on the size of the project).
-
Execute the parser script to actually store the values in the database.
bash parser ACRA acra false
-
Proceed to
http://localhost/GitView/index.php
which should now be displaying the newly parsed project. Note this can be done before the parser is finished since the changes will be visible on the site immediately.
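Since parsing depends on a completed scrape of the same repository, the two steps can be chained so the parser only runs when the scraper succeeds. A minimal sketch, assuming the scraper and parser scripts are in the current directory as the commands above imply. The helper name `collect_repo` is hypothetical, and the third parser argument is passed as `false` exactly as in the example above, since its meaning is not documented here.

```shell
# collect_repo (hypothetical helper): run the scraper and then the parser
# for one repository; the parser is skipped if the scraper fails.
# $1 = repository owner, $2 = repository name.
collect_repo() {
  bash scraper "$1" "$2" && bash parser "$1" "$2" false
}
```

For example, `collect_repo ACRA acra` reproduces the two commands shown earlier in sequence.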
This section outlines how to set up the metrics collecting script.
-
To install Oracle's Java, please follow this guide
-
Install Maven
sudo apt-get install maven2
-
Get Eclipse Luna and extract it to a preferred location.
-
Install the Metrics plug-in for Eclipse by adding the source:
http://metrics2.sourceforge.net/update/
-
Install Python
sudo apt-get install python2.7
-
Download the Eclipse metrics XML reader
-
Install the ADT plug-in for Eclipse by adding the source:
https://dl-ssl.google.com/android/eclipse/
-
Re-open Eclipse, which will prompt you to install the Android SDK.
-
Open the Android SDK Manager
-
Select all the required SDK Platform versions. If an older version of the target application used an earlier version of the Android SDK, then that version will be required as well. The most flexible method is to install every Android version. Note: downloading and installing may take some time.
-
Clone the repository
-
Follow the instructions on installing
-
Open the metric_compiler script and adjust the following variables:
- ECLIPSE_LOCATION: the location of the Eclipse binary.
- WORKSPACE: the location of the workspace to use.
- SCRIPT_WORK_DIR: the location to create temporary files.
- TEMPLATE_BUILD_FILE_LOCATION: the location of the template build.xml file.
- XML_CONVERTER_LOCATION: the location of the clone of the XML-to-CSV program.
-
Open the metrics_calc.rb script and adjust the following:
- project_dir: the location the project will be cloned to and where each commit is checked out.
- output_dir: the directory to output the metrics CSV files to.
- log_file: the directory where the log files will be placed.
- log: whether to output the log file or not.
- headless: whether to run with xvfb (a virtualized graphical environment) or not.
- metrics_compiler: the location of the metrics compiler shell script.
-
Execute the script to collect metrics for all stored repositories with:
ruby metrics_calc.rb
-
Alternatively, you can specify which repository to collect metrics for using:
ruby metrics_calc.rb ACRA acra
-
This can take a very long time and makes the computer it is running on harder to use (Eclipse will open, take focus, and then close).
- Note this can also produce a large number of log and output files so it is wise to direct each of them to separate empty directories.
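Because log and output files accumulate quickly, it can help to confirm that the directories configured in metrics_calc.rb are empty before starting a run. A minimal sketch with a hypothetical helper name `is_empty_dir`; the paths you pass must match whatever you set for output_dir and log_file.

```shell
# is_empty_dir (hypothetical helper): succeed only if $1 is an existing
# directory with no entries, hidden files included.
is_empty_dir() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}
```

For example, `is_empty_dir /path/to/output && is_empty_dir /path/to/logs && ruby metrics_calc.rb ACRA acra` starts the collection only when both directories are empty.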