Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Initial commit. Bringing over work from deprecated repo.
Showing 40 changed files with 3,666 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
.DS_Store | ||
*.pyc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
Copyright (c) 2014, Robert deCarvalho | ||
All rights reserved. | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
|
||
1. Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
2. Redistributions in binary form must reproduce the above copyright notice, | ||
this list of conditions and the following disclaimer in the documentation | ||
and/or other materials provided with the distribution. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR | ||
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
The views and conclusions contained in the software and documentation are those | ||
of the authors and should not be interpreted as representing official policies, | ||
either expressed or implied, of the FreeBSD Project. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,109 @@ | ||
pandashells | ||
=========== | ||
PANDASHELLS | ||
=== | ||
|
||
Bringing the power of python-pandas to the shell prompt | ||
Description | ||
------------------------------------------------------------------------------- | ||
The ptools library was written to bring the power of the python scientific | ||
stack to the unix command-line. This allows well-known and time-tested tools | ||
like grep, awk, sed, etc. to interact seamlessly with the powerful data | ||
manipulation, visualization, and statistical libraries being developed in the | ||
python data-science community. | ||
|
||
|
||
Coming soon. | ||
Installation | ||
-------------------------------------------------------------------------------- | ||
--- master branch | ||
pip install git+https://github.com/robdmc/ptools.git | ||
|
||
--- experimental branch with pandas (very early stage development) | ||
pip install git+https://github.com/robdmc/ptools.git@with_pandas | ||
|
||
|
||
List of tools (run with -h for help, --example to see example) | ||
-------------------------------------------------------------------------------- | ||
p.df Pandas dataframe manipulation of csv files | ||
|
||
|
||
*********** here are some new tools I want | ||
p.lombscargle | ||
p.mcmc 'patsy model' (see if there's an easy way to do this) | ||
Maybe make distribution,params,prior for each variable | ||
p.mcmc 'y ~ x + z' 'x:Normal(mu, sigma)', y:Normal(mu,sigma) | ||
think about defaults here where partials don't have noise | ||
|
||
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ | ||
here are some regression and classification ideas. | ||
|
||
p.regress - statsmodels linear regression with full summary output. maybe use --fit to add fit results to df | ||
p.learn.regress_linear | ||
p.learn.regress_ridge | ||
p.learn.regress_tree | ||
p.learn.regress_forest | ||
p.learn.classify.logistic | ||
p.learn.classify.tree | ||
p.learn.classify.forest | ||
p.learn.classify.svm | ||
|
||
Always use patsy language | ||
|
||
the model.pkl files (which can be user-def names) hold the model as well | ||
as the string used to do the fit | ||
|
||
with --fit model.pkl | ||
saves model in model.pkl and displays rms R^2 and cross_val scores | ||
as well as the original string used to do the fit and the type of model | ||
|
||
|
||
with --predict model.pkl | ||
loads model, input and shows _fit variable to the dataframe | ||
with --stats, does same thing, but displays rms and R2 | ||
with --hist shows hist of residuals | ||
with --plot shows fit vs residual | ||
|
||
of course classifiers have their own metrics and maybe have a | ||
--roc that plots the roc curve | ||
|
||
with | ||
--info model.pkl, just shows the model | ||
|
||
with --desc 'my desc' allows you to store a description that will be | ||
displayed with the --info flag | ||
|
||
|
||
|
||
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ | ||
|
||
|
||
|
||
|
||
********** here is list of tools I want to replicate ********************* | ||
p.cov -> covariance between columns. cols and index have respective names | ||
*p.parallel | ||
*p.plot | ||
*p.geoCode | ||
*p.crypt | ||
p.bar | ||
p.cdf | ||
p.color | ||
p.fft | ||
p.lombscargle | ||
p.hist | ||
p.interp # cat xvals_file | p.interp -r .6 -t <(cat table_file.txt) | ||
p.linspace | ||
p.map | ||
p.mapDots2html | ||
p.mapPoly2html | ||
p.mongoDump | ||
p.normalize | ||
p.pgsql2csv | ||
p.pie | ||
p.rand | ||
p.regress | ||
p.scat | ||
p.server | ||
p.shuffle | ||
p.sigEdit | ||
p.smooth lowess, spline, medianFilter | ||
p.sshKeyPush | ||
p.template | ||
p.utc2local |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
p.regress - statsmodels linear regression with full summary output | ||
p.learn.regress_linear | ||
p.learn.regress_ridge | ||
p.learn.regress_tree | ||
p.learn.regress_forest | ||
p.learn.classify.logistic | ||
p.learn.classify.tree | ||
p.learn.classify.forest | ||
p.learn.classify.svm | ||
|
||
Always use patsy language | ||
|
||
the model.pkl files (which can be user-def names) hold the model as well | ||
as the string used to do the fit | ||
|
||
with --fit model.pkl | ||
saves model in model.pkl and displays rms R^2 and cross_val scores | ||
as well as the original string used to do the fit and the type of model | ||
|
||
|
||
with --predict model.pkl | ||
loads model, input and shows _fit variable to the dataframe | ||
with --stats, does same thing, but displays rms and R2 | ||
with --hist shows hist of residuals | ||
with --plot shows fit vs residual | ||
|
||
of course classifiers have their own metrics and maybe have a | ||
--roc that plots the roc curve | ||
|
||
with | ||
--info model.pkl, just shows the model | ||
|
||
with --desc 'my desc' allows you to store a description that will be | ||
displayed with the --info flag | ||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
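The notes above describe a model.pkl file that holds the fitted model together with the string used to do the fit, plus rms and R^2 reporting. A minimal sketch of that idea follows, assuming a plain dict as the pickled bundle and a hand-rolled least-squares line fit standing in for statsmodels or scikit-learn. Every name here (`fit_line`, `save_bundle`, `load_bundle`, `predict`, `fit_stats`, `demo`) is hypothetical and not part of this commit.

```python
import math
import os
import pickle
import tempfile


def fit_line(xs, ys):
    """Ordinary least squares for y ~ x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def save_bundle(path, formula, model_type, params, description=""):
    """Pickle the fitted parameters together with the formula string."""
    bundle = {
        "formula": formula,          # the string used to do the fit
        "model_type": model_type,    # e.g. "regress_linear"
        "params": params,            # stand-in for a real model object
        "description": description,  # what a --desc flag might store
    }
    with open(path, "wb") as handle:
        pickle.dump(bundle, handle)


def load_bundle(path):
    """Load a bundle written by save_bundle."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict(bundle, xs):
    """Apply the stored line to new x values."""
    intercept, slope = bundle["params"]
    return [intercept + slope * x for x in xs]


def fit_stats(ys, fitted):
    """Return (rms of residuals, R^2)."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(ss_res / n), 1.0 - ss_res / ss_tot


def demo():
    """Fit, save, reload, and score a toy data set lying on y = 1 + 2x."""
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    path = os.path.join(tempfile.mkdtemp(), "model.pkl")
    save_bundle(path, "y ~ x", "regress_linear", fit_line(xs, ys), "demo fit")
    bundle = load_bundle(path)
    rms, r2 = fit_stats(ys, predict(bundle, xs))
    return bundle, rms, r2
```

Keeping the formula and model type next to the parameters is what would let an `--info` style flag report how the model was built without refitting. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.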
Empty file.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
#! /usr/bin/env python | ||
|
||
#--- standard library imports | ||
import os | ||
import sys | ||
import argparse | ||
|
||
############# dev only. Comment out for production ###################### | ||
sys.path.append('../..') | ||
########################################################################## | ||
|
||
|
||
from ptools.lib import config_lib | ||
|
||
|
||
if __name__ == '__main__': | ||
|
||
    #--- read in the current configuration | ||
    default_dict = config_lib.get_config() | ||
|
||
    msg = "Need to write this. " | ||
    msg += "and write more." | ||
|
||
    #--- populate the arg parser with current configuration | ||
    parser = argparse.ArgumentParser( | ||
            description=msg) | ||
    parser.add_argument('--force_defaults', action='store_true', | ||
            dest='force_defaults', | ||
            help='Force to default settings') | ||
    for tup in config_lib.CONFIG_OPTS: | ||
        msg = 'opts: '+str(tup[1]) | ||
        parser.add_argument('--%s'%tup[0], nargs=1, type=str, | ||
                dest=tup[0], metavar='',#default_dict[tup[0]], | ||
                default=[default_dict[tup[0]]], choices=tup[1], help=msg) | ||
|
||
    #--- parse arguments | ||
    args = parser.parse_args() | ||
|
||
    #--- set the arguments to the current value of the arg parser | ||
    config_dict = {t[0]:t[1][0] for t in args.__dict__.iteritems() | ||
            if not t[0] in ['force_defaults']} | ||
|
||
    if args.force_defaults: | ||
        config_dict = config_lib.DEFAULT_DICT | ||
    config_lib.set_config(config_dict) | ||
|
||
    print '\n Current Config' | ||
    print ' ' + '-'*40 | ||
    for k in sorted(config_dict.keys()): | ||
        if not k in ['--force_defaults']: | ||
            print ' {: <20} {}'.format(k+':', config_dict[k]) | ||
|
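The script above builds one command-line option per entry in `config_lib.CONFIG_OPTS`, restricts each to its allowed choices, and defaults it to the currently stored value. A self-contained sketch of that pattern in Python 3 follows. The option table, the `DEFAULTS` dict, and the functions `build_parser` and `resolve_config` are illustrative assumptions, since the contents of `config_lib` are not shown in this part of the diff.

```python
import argparse

# Hypothetical stand-in for config_lib.CONFIG_OPTS: (name, allowed values).
CONFIG_OPTS = [
    ("io_input_type", ["csv", "table"]),
    ("io_output_type", ["csv", "table", "html"]),
]

# Assumption for this sketch: the first choice of each option is its default.
DEFAULTS = {name: choices[0] for name, choices in CONFIG_OPTS}


def build_parser(current):
    """Create a parser whose defaults come from the current configuration."""
    parser = argparse.ArgumentParser(description="Show or update settings")
    parser.add_argument("--force_defaults", action="store_true",
                        help="Force to default settings")
    for name, choices in CONFIG_OPTS:
        parser.add_argument("--" + name, choices=choices,
                            default=current[name],
                            help="opts: " + str(choices))
    return parser


def resolve_config(argv, current):
    """Return the configuration implied by argv, starting from `current`."""
    args = build_parser(current).parse_args(argv)
    if args.force_defaults:
        return dict(DEFAULTS)
    return {k: v for k, v in vars(args).items() if k != "force_defaults"}
```

Dropping `nargs=1` means each parsed value is a plain string rather than a one-element list, so the sketch needs no `[0]` unwrapping when it rebuilds the configuration dict.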
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
#! /usr/bin/env python | ||
|
||
#--- standard library imports | ||
import os | ||
import sys | ||
import argparse | ||
import re | ||
|
||
############# dev only. Comment out for production ###################### | ||
sys.path.append('../..') | ||
########################################################################## | ||
|
||
from ptools.lib import arg_lib | ||
|
||
#============================================================================= | ||
if __name__ == '__main__': | ||
    msg = "Encrypt a file with aes-256-cbc as implemented by openssl. " | ||
|
||
    #--- read command line arguments | ||
    parser = argparse.ArgumentParser( | ||
            description=msg) | ||
|
||
    arg_lib.addArgs(parser, 'example') | ||
|
||
    parser.add_argument('-i', '--inFile', nargs=1, type=str, | ||
            required=True, dest='inFile', metavar='inFileName', | ||
            help="The input file name") | ||
|
||
    parser.add_argument('-o', '--outFile', nargs=1, type=str, | ||
            required=True, dest='outFile', metavar='outFileName', | ||
            help="The output file name") | ||
|
||
    parser.add_argument('-d', '--decrypt', action='store_true', default=False, | ||
            dest='decrypt', help='Decrypt the input file into the output file') | ||
|
||
    #--- parse arguments | ||
    args = parser.parse_args() | ||
|
||
    #--- make sure input file exists | ||
    if not os.path.isfile(args.inFile[0]): | ||
        sys.stderr.write("\n\nCan't find input file\n\n") | ||
        sys.exit(1) | ||
|
||
    #--- create a decryption command if requested | ||
    if args.decrypt: | ||
        cmd = "cat %s | openssl enc -d -aes-256-cbc > %s" % (args.inFile[0], | ||
                args.outFile[0]) | ||
    #--- otherwise just encrypt | ||
    else: | ||
        cmd = "cat %s | openssl enc -aes-256-cbc -salt > %s" % (args.inFile[0], | ||
                args.outFile[0]) | ||
    #--- run the proper openssl command | ||
    os.system(cmd) |
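The script above interpolates file names into a shell string, so a name containing spaces or shell metacharacters would break the command or be interpreted by the shell. One possible alternative, sketched here and not what the commit does, is to pass an argument list to `subprocess` and let openssl open the files itself through its `-in` and `-out` options. The helper names `build_openssl_cmd` and `run_openssl` are hypothetical.

```python
import subprocess


def build_openssl_cmd(in_file, out_file, decrypt=False):
    """Build the openssl argument list; no shell quoting is required."""
    cmd = ["openssl", "enc", "-aes-256-cbc"]
    # Mirror the script: -d when decrypting, -salt when encrypting.
    cmd.append("-d" if decrypt else "-salt")
    cmd += ["-in", in_file, "-out", out_file]
    return cmd


def run_openssl(in_file, out_file, decrypt=False):
    """Run openssl without a shell; raises CalledProcessError on failure."""
    subprocess.check_call(build_openssl_cmd(in_file, out_file, decrypt))
```

Because no shell is involved, a file name such as `a b.txt` reaches openssl as a single argument, and a non-zero exit status surfaces as an exception instead of being ignored.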
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
#! /usr/bin/env python | ||
|
||
#--- standard library imports | ||
import os | ||
import sys | ||
import argparse | ||
import re | ||
|
||
############# dev only. Comment out for production ###################### | ||
sys.path.append('../..') | ||
########################################################################## | ||
|
||
from ptools.lib import module_checker_lib, arg_lib, io_lib | ||
|
||
#--- import required dependencies | ||
modulesOkay = module_checker_lib.check_for_modules( | ||
        [ | ||
            'pandas', | ||
            'numpy', | ||
            'scipy', | ||
            'dateutil', | ||
            'matplotlib', | ||
        ]) | ||
if not modulesOkay: | ||
    sys.exit(1) | ||
|
||
import pandas as pd | ||
import numpy as np | ||
import scipy as scp | ||
import pylab as pl | ||
from dateutil.parser import parse | ||
import datetime | ||
|
||
#============================================================================= | ||
if __name__ == '__main__': | ||
    msg = "Bring pandas manipulation to command line. Input from stdin " | ||
    msg += "is placed into a dataframe named 'df'. The output of each " | ||
    msg += "specified command must evaluate to a dataframe that will " | ||
    msg += "overwrite 'df'. The output of the final command will be sent " | ||
    msg += "to stdout. The namespace in which the commands are executed " | ||
    msg += "includes pandas as pd, numpy as np, scipy as scp, pylab as pl, " | ||
    msg += "dateutil.parser.parse as parse, datetime" | ||
|
||
    #--- read command line arguments | ||
    parser = argparse.ArgumentParser( | ||
            description=msg) | ||
|
||
    options = {} | ||
    arg_lib.addArgs(parser, 'io_in', 'io_out', 'example') | ||
    parser.add_argument("statement", help="Statement to execute", nargs="+") | ||
|
||
    #--- parse arguments | ||
    args = parser.parse_args() | ||
|
||
    #--- get the input dataframe | ||
    df = io_lib.df_from_input(args) | ||
|
||
    #--- define regex to identify if supplied command is for col assignment | ||
    rex_col_cmd = re.compile(r'.*?df\[.+\].*?=') | ||
|
||
    #--- define regex to identify plot commands | ||
    rex_plot_cmd = re.compile(r'.*(plot|hist)\(.*\).*') | ||
|
||
    #--- execute the statements in sequence | ||
    for cmd in args.statement: | ||
        #--- if this is a column-assignment command, just execute it | ||
        if rex_col_cmd.match(cmd): | ||
            exec(cmd) | ||
            temp = df | ||
        #--- if this is a plot command, execute it and quit | ||
        elif rex_plot_cmd.match(cmd): | ||
            exec(cmd) | ||
            pl.show() | ||
            sys.exit(0) | ||
|
||
        #--- if instead this is a command on the whole frame | ||
        else: | ||
            #--- put results of command in temp var | ||
            cmd = 'temp = {}'.format(cmd) | ||
            exec(cmd) | ||
|
||
        #--- transform results to dataframe if needed | ||
        if isinstance(temp, pd.DataFrame): | ||
            df = temp | ||
        else: | ||
            try: | ||
                df = pd.DataFrame(temp) | ||
            except pd.core.common.PandasError: | ||
                print temp | ||
                sys.exit(0) | ||
|
||
    #--- write dataframe to output | ||
    io_lib.df_to_output(args, df) |
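The loop above routes each statement to one of three branches based on two regular expressions. The sketch below isolates that dispatch so it can be exercised without pandas. The two patterns are copied from the script; the `classify` function and its return labels are illustrative only.

```python
import re

# Patterns copied from the script above.
REX_COL_CMD = re.compile(r'.*?df\[.+\].*?=')
REX_PLOT_CMD = re.compile(r'.*(plot|hist)\(.*\).*')


def classify(cmd):
    """Label a statement the way the script's if/elif/else would route it."""
    if REX_COL_CMD.match(cmd):
        return "assign"  # executed as-is; df is assumed modified in place
    if REX_PLOT_CMD.match(cmd):
        return "plot"    # executed, then the figure is shown and we exit
    return "expr"        # evaluated, and the result replaces df


examples = [
    "df['c'] = df.a + df.b",
    "df.plot(x='a', y='b')",
    "df.describe()",
    "df[df['a'] == 1]",
]
labels = [classify(cmd) for cmd in examples]
```

The last example shows a limitation of the column-assignment pattern: a boolean filter such as `df[df['a'] == 1]` contains `]` followed by `=`, so it is labeled "assign" and its result would be discarded. A filter written as `df[df.a == 1]` does not trigger this, because no `=` follows its only closing bracket.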