Provides higher level abstraction for database queries and
devtools::install_github('AZASRS/asrsMethods/asrsMethods') devtools::install_github('AZASRS/AZASRS', force = TRUE)
Ensure that you have received the
.Renviron file and put it into your home folder. Locate your home folder via:
.Renviron file contains all of the environment variables that you need in order to connect to the database. It's important that you have your own
GITHUB_PAT and this will not work if you do not have a blank line at the end of the file (strange requirement, but important).
Please utilize the projects section: https://github.com/AZASRS/AZASRS/projects/1 in order to give your ideas and start planning projects. We can keep track of things here and make sure that we are not duplicating efforts or repeating mistakes. We'll start to develop a process surrounding this that will allow us to keep our workflow integrated and our code nice and clean. Please note, that you should always make a branch to work on, commit your changes regularly with good notes, and make pull requests on a small set of features. Doing anything large-scale will result in merge conflicts and changes will take longer to complete. Consistently fetching new branches and merging into the branch you are working on is also a good practice. Upon finishing a feature in the branch you created, you will make a pull request and then we can talk through it and make changes if necessary. All tests must pass before you can merge changes. Running
testthat::test_dir('tests') will check existing code. You will be required to write tests for any code you will be adding as well in order to ensure we can trust changes moving forward.
Workflow for a contribution to the library:
- Start by going to: https://github.com/AZASRS/AZASRS/projects/1
- On the far left, add a note under wishlist (make it short enough to read)
- If it is complicated, click the three dots
...on your new note and click "Convert to issue". There you can make some more notes and add a use case. You can also assign users so that people know who this affects.
- Keep in mind, most people will not immediately know what the end result you're looking for is, so provide a detailed explanation of what it should do and how it should be calculated.
- If it is possible to add a screenshot of the end result (keep proprietary information in mind) that would be helpful
- Pull the latest
developmentbranch to make sure it is up-to-date
- Checkout your
- Create a new branch from
developmentwith a name that reflects the feature you're working on (i.e.
cash_adjusted_navs). If you are creating a new function, please name the function the same as the file within the
- Make code changes / add new files / etc. -- consider the outcome and create a test by running the command
my_new_functionrepresents the new file/function you want to test. Ensure this test fails if it does not return the proper results.
- Run all tests
testthat::test_dir('tests')-- ensure they pass (the testthat results are above the "Results" section, and show something like: [ OK: 19 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]) -- if all tests do not pass, figure out the problem and fix it before committing or making a pull request.
- With each change, save, stage, commit often (in small batches, typically one file at a time) with informative commit messages. Small commits are extremely important, and incremental changes must be made in order to avoid merge conflicts.
- Push changes to GitHub
- Make pull request to merge your branch into
- Changes will be reviewed by at least 2 people and then merged if they satisfy all requirements
- On a schedule, all changes should be merged into
developmentand then the changes will be merged into
- This will be followed by a "new release" in which all parties should be notified and everyone should install the newest library with
devtools::install_github('AZASRS/AZASRS', force = TRUE)
- Do not add any .csv or other data that could be seen as proprietary to GitHub
- Never "hard code" any file locations
- When needing to save data, utilize the
saveRDS_test()function. Where you take the variable you want to save and utilize the function name you have created. For example, if I created a function called
my_favorite_functionthat returned a tibble of values, I would then run the following in my console:
my_favorite_function = my_favorite_function(argument1, argument2) saveRDS_test(my_favorite_function)
- This creates an .rds file (so ensure that there is no proprietary information, or very minimal by perhaps writing
%>% head(3)to the end so that very little is available)
- After running
usethis::use_test('my_favorite_function')a file will be created:
./tests/testthat/test-my_favorite_function.Rthat typically is opened up automatically and you can add your test there. A small example is included and please reach out if you need help understanding what to write. If you look at the other examples it may be easy to see what to test. These do not need to be extremely difficult tests but should cover the majority of what you are looking to achieve.
testthat::test_dir('tests')to see if your tests pass! You should also slightly modify it after to ensure it fails in order to know that you are testing something meaningful. But at the end of the day it needs to pass.
These will begin with
get_ and do not require data to be passed.
- Pull all NAV (contains meta data):
- Pull all Cash Flow (contains meta data):
- Pull Fund Info:
- Pull benchmark index:
- Beware that this is a large amount of data and should be filtered
Building high level output
These will begin with
build_ and do not require data to be passed, however, it will likely only be useful if you do so.
- Build all Private Market Metrics:
- Build P2P IRR's:
- Build benchmark future value index factors:
These will begin with
calc_ and require specific inputs. These can be seen on a case by case basis within the documentation.
- Calculate irr:
calc_irr()-- requires data to be passed
Raw data (database tables)
You can pull the raw data from the database tables if you need to. This should be a last resort because schema changes are possible and this would make breaking changes in your reports. These willall begin with
tbl_. If you need to do joining, you should not ever use raw SQL but should instead use these
- Getting Fund Info:
If you are joining these, you will need to utilize a single connection, available, for example:
con = AZASRS_DATABASE_CONNECTION() tbl_pm_fund_nav_daily(con) %>% left_join(tbl_pm_fund_info(con), by = 'pm_fund_info_id')
You can return a
tibble or a database object (shown below).
# Source: lazy query [?? x 27] # Database: Microsoft SQL Server 12.00.2000[dbo@azasrs/asrs]
tibble requires it to actually calculate from the database, be stored in R (RAM) and then be processed. If you utilize the database objects, you may do calculations before it reaches your local computer. The latter method will optimize performance by putting computing power in the database. Many functions utilize
return_tibble = TRUE (or
FALSE) which allows the user to decide.
Utilizing more functionality
You can utilize most of the
dbplyr packages. This allows for the use of
filter, etc. This will speed up your data analysis dramatically by reducing the number of rows and columns you are pulling.
get_pm_nav_daily(return_tibble = FALSE) %>% dplyr::filter(effective_date >= '2019-01-01')
Examples provided in
- Updating the database requires files to be uploaded into Azure Blob Storage. These must be created and updated utilizing the