PathToPostgreSql

Grant Gainey edited this page Dec 12, 2016 · 2 revisions
Clone this wiki locally

For up-to-date information about the PostgreSQL port, see PostgreSQL. This page is now obsolete.


Introduction

Below is a collection of information related to the effort to remove Spacewalk's dependency on Oracle, support PostgreSQL, and ultimately be database agnostic. If you are interested in participating in this effort please feel free to review the projects and tasks below to find something that suits your abilities. If you find something you can assign the task to yourself. Help should be available if you require it on IRC and the mailing list.

All estimates are given in days.

The architecture we hope to achieve:

  • All database communication done via standardized SQL that will run against the major databases.
  • Leverage Java Hibernate mappings whenever possible.
  • Migrate all stored procedures to Java application code. (unless absolutely necessary to keep for performance reasons)
  • If stored procedure was used in the Perl stack, migrate the affected pages to Java/JSP.
  • Implement an RPC API for the Python backend to call in place of the stored procedure. (NOTE: unclear if this will be required)

Most development of PostgreSQL specific code is underway in the postgresql branch in git. For information on how to checkout and work with this branch, see the GitGuide.

Support Intallation of PostgreSQL Schema

Requires modifications to spacewalk-setup code to isolate any Oracle specific code and fork the execution path based on the selected database backend.

Current plan is to have PostgreSQL schema generated from Oracle schema using a simple type template system.

Task Estimate Owner Status
Modify spacewalk-setup to support multiple databases 8 dgoodwin ~70%
Generate PostgreSQL schema from templates - paji done

Port Stored Procedures to Java Application Code

After examining the advantages/disadvantages of migrating stored procedures straight to Java application code (that will theoretically run on multiple databases) it was decided the alternative (porting PL/SQL to PL/pgSQL and maintaining both versions) would be minimally less expensive (if we require strict unit testing of the new code). Considering that we would ultimately like to be database agnostic, the effort to reach the end goal could double and thus we chose to accept the overhead and proceed to obtain the architecture we want.

See also:

TODO: Add documentation for process involved with performing an actual migration of a stored proc. Link to a solid commit that demonstrates the whole process would also be nice.

Estimates below are based on the following:

  • <= 25 lines = small procedure = ~2 days
  • <= 75 lines = medium procedure = ~5 days
  • 75 lines = large procedure = ~10 days

| Procedure | Lines | Estimate | Owner | Status | | --- | --- | --- | --- | --- | --- | | is_user_org_admin | 28 | 5 | dgoodwin | done | | lookup_channel_arch | 15 | 2 | | | | | name_join | 23 | 2 | | | | | rhn_install_org_satellites | 25 | 2 | | | | | rhnhistoryview_pkglist | 42 | 5 | | | | | lookup_package_nevra | 36 | 5 | | | | | rhn_synch_probe_state | 27 | 5 | | | | | new_user_postop | 29 | 5 | | | | | delete_errata | 12 | 2 | | | | | lookup_transaction_package | 51 | 5 | | | | | lookup_feature_type | 16 | 2 | | | | | lookup_config_info | 29 | 5 | | | | | lookup_cf_state | 13 | 2 | | | | | queue_server | 28 | 5 | | | | | lookup_server_arch | 15 | 2 | | | | | delete_server | 181 | 10 | | | | | lookup_sg_type | 16 | 2 | | | | | lookup_snapshot_invalid_reason | 16 | 2 | | | | | rhn_install_satellite | 14 | 2 | | | | | lookup_virt_sub_level | 16 | 2 | | | | | lookup_client_capability | 20 | 2 | | | | | lookup_package_provider | 12 | 2 | | | | | create_first_org | 113 | 10 | | | | | lookup_cve | 20 | 2 | | | | | delete_server_bulk | 199 | 10 | | | | | queue_errata | 9 | 2 | | | | | label_join | 23 | 2 | | | | | truncatecachequeue | 4 | 2 | | | | | lookup_package_name | 23 | 2 | | | | | rhn_clean_current_state | 8 | 2 | dgoodwin | done | | lookup_package_key_type | 12 | 2 | | | | | lookup_tag_name | 19 | 2 | | | | | lookup_source_name | 19 | 2 | | | | | create_new_user | 73 | 5 | | | | | lookup_arch_type | 13 | 2 | | | | | id_join | 23 | 2 | | | | | create_new_org | 98 | 10 | | | | | rhn_prepare_install | 39 | 5 | | | | | create_pxt_session | 16 | 2 | | | | | lookup_first_matching_cf | 35 | 5 | | | | | channel_name_join | 23 | 2 | | | | | lookup_evr | 21 | 2 | | | | | lookup_package_arch | 18 | 2 | | | | | rhnhistoryview_erratalist | 40 | 5 | | | | | delete_channel | 16 | 2 | | | | | lookup_config_filename | 20 | 2 | | | | | lookup_package_delta | 19 | 2 | | | | | pxt_session_cleanup | 25 | 2 | | | | | does_user_have_role | 26 | 5 | | | | | lookup_package_capability | 30 | 5 | | | | | lookup_tag | 20 | 2 | | | | | lookup_erratafile_type | 16 | 2 | | | | | set_ks_session_history_message | 35 | 5 | | | | | is_user_applicant | 28 | 5 | | | | | rhn_server.system_service_level | 16 | 2 | | | | | rhn_server.can_change_base_channel | 15 | 2 | | | | | rhn_server.set_custom_value | 26 | 5 | | | | | rhn_server.bulk_set_custom_value | 19 | 2 | | | | | rhn_server.bulk_snapshot_tag | 31 | 5 | | | | | rhn_server.tag_delete | 24 | 2 | | | | | rhn_server.tag_snapshot | 10 | 2 | | | | | rhn_server.bulk_snapshot | 12 | 2 | | | | | rhn_server.snapshot_server | 89 | 10 | | | | | rhn_server.remove_action | 62 | 5 | | | | | rhn_server.check_user_access | 27 | 5 | | | | | rhn_server.can_server_consume_virt_slot | 25 | 2 | | | | | rhn_server.insert_into_servergroup | 69 | 5 | | | | | rhn_server.insert_into_servergroup_maybe | 25 | 2 | | | | | rhn_server.insert_set_into_servergroup | 26 | 5 | | | | | rhn_server.delete_from_servergroup | 66 | 5 | | | | | rhn_server.delete_set_from_servergroup | 23 | 2 | | | | | rhn_server.clear_servergroup | 11 | 2 | | | | | rhn_server.delete_from_org_servergroups | 14 | 2 | | | | | rhn_server.get_ip_address | 22 | 2 | | | | | rhn_org.find_server_group_by_type | 9 | 2 | | | | | rhn_org.delete_org | 45 | 5 | | | | | rhn_org.delete_user | 120 | 10 | | | | | rhn_exception.lookup_exception | 13 | 2 | | | | | rhn_exception.raise_exception | 11 | 2 | | | | | rhn_exception.raise_exception_val | 13 | 2 | | | | | rhn_package.canonical_name | 12 | 2 | | | | | rhn_package.channel_occupancy_string | 16 | 2 | | | | | rhn_bel.is_org_paid | 10 | 2 | | | | | rhn_bel.lookup_email_state | 11 | 2 | | | | | rhn_quota.recompute_org_quota_used | 24 | 2 | | | | | rhn_quota.get_org_for_config_content | 14 | 2 | | | | | rhn_quota.set_org_quota_total | 20 | 2 | | | | | rhn_quota.update_org_quota | 8 | 2 | | | | | rhn_user.check_role | 18 | 2 | | | | | rhn_user.check_role_implied | 18 | 2 | | | | | rhn_user.get_org_id | 9 | 2 | | | | | rhn_user.find_mailable_address | 69 | 5 | | | | | rhn_user.add_servergroup_perm | 21 | 2 | | | | | rhn_user.remove_servergroup_perm | 18 | 2 | | | | | rhn_user.add_to_usergroup | 18 | 2 | | | | | rhn_user.add_users_to_usergroups | 13 | 2 | | | | | rhn_user.remove_from_usergroup | 23 | 2 | | | | | rhn_user.remove_users_from_servergroups | 14 | 2 | | | | | rhn_entitlements.remove_org_entitlements | 35 | 5 | | | | | rhn_entitlements.entitlement_grants_service | 26 | 5 | | | | | rhn_entitlements.lookup_entitlement_group | 19 | 2 | | | | | rhn_entitlements.create_entitlement_group | 19 | 2 | | | | | rhn_entitlements.can_entitle_server | 55 | 5 | | | | | rhn_entitlements.can_switch_base | 23 | 2 | | | | | rhn_entitlements.find_compatible_sg | 29 | 5 | | | | | rhn_entitlements.entitle_server | 45 | 5 | | | | | rhn_entitlements.remove_server_entitlement | 73 | 5 | | | | | rhn_entitlements.unentitle_server | 41 | 5 | | | | | rhn_entitlements.repoll_virt_guest_entitlements | 109 | 10 | | | |

rhn_entitlements.get_server_entitlement 26 5
rhn_entitlements.modify_org_service 127 10
rhn_entitlements.set_customer_enterprise 5 2
rhn_entitlements.set_customer_provisioning 5 2
rhn_entitlements.set_customer_monitoring 5 2
rhn_entitlements.set_customer_nonlinux 5 2
rhn_entitlements.unset_customer_enterprise 5 2
rhn_entitlements.unset_customer_provisioning 5 2
rhn_entitlements.unset_customer_monitoring 5 2
rhn_entitlements.unset_customer_nonlinux 5 2
rhn_entitlements.prune_group 81 10
rhn_entitlements.assign_system_entitlement 85 10
rhn_entitlements.assign_channel_entitlement 62 5
rhn_entitlements.activate_system_entitlement 41 5
rhn_entitlements.activate_channel_entitlement 40 5
rhn_entitlements.set_group_count 58 5
rhn_entitlements.prune_family 56 5
rhn_entitlements.set_family_count 63 5
rhn_entitlements.entitle_last_modified_servers 36 5
rhn_entitlements.prune_everything 50 5
rhn_entitlements.subscribe_newest_servers 37 5
rhn_config.prune_org_configs 6 2
rhn_config.insert_revision 30 5
rhn_config.delete_revision 70 5
rhn_config.get_latest_revision 13 2
rhn_config.insert_file 20 2
rhn_config.delete_file 19 2
rhn_config.insert_channel 25 2
rhn_config.delete_channel 13 2
rhn_config_channel.action_diff_revision_status 45 5
rhn_config_channel.get_user_chan_access 47 5
rhn_config_channel.get_user_revision_access 16 2
rhn_config_channel.get_user_file_access 15 2
rhn_cache.update_perms_for_server 10 2
rhn_cache.update_perms_for_user 23 2
rhn_cache.update_perms_for_server_group 24 2
rhn_channel.get_license_path 14 2
rhn_channel.license_consent 15 2
rhn_channel.subscribe_server 98 10
rhn_channel.can_server_consume_virt_channl 25 2
rhn_channel.bulk_subscribe_server 7 2
rhn_channel.bulk_server_base_change 11 2
rhn_channel.bulk_server_basechange_from 42 5
rhn_channel.bulk_guess_server_base 19 2
rhn_channel.guess_server_base 16 2
rhn_channel.normalize_server_arch 15 2
rhn_channel.base_channel_for_release_arch 20 2
rhn_channel.base_channel_rel_archid 40 5
rhn_channel.bulk_guess_server_base_from 25 2
rhn_channel.clear_subscriptions 16 2
rhn_channel.unsubscribe_server 83 10
rhn_channel.bulk_unsubscribe_server 7 2
rhn_channel.family_for_channel 13 2
rhn_channel.available_family_subscriptions 33 5
rhn_channel.channel_family_current_members 22 2
rhn_channel.update_family_counts 10 2
rhn_channel.available_chan_subscriptions 11 2
rhn_channel.entitle_customer 27 5
rhn_channel.set_family_maxmembers 25 2
rhn_channel.unsubscribe_server_from_family 11 2
rhn_channel.get_org_id 9 2
rhn_channel.get_cfam_org_access 12 2
rhn_channel.get_org_access 17 2
rhn_channel.user_role_check_debug 35 5
rhn_channel.user_role_check 6 2
rhn_channel.loose_user_role_check 8 2
rhn_channel.direct_user_role_check 17 2
rhn_channel.org_channel_setting 17 2
rhn_channel.channel_priority 77 10
rhn_channel.delete_server_channels 34 5
rhn_channel.refresh_newest_package 17 2
rhn_channel.update_channel 27 5
rhn_channel.update_channels_by_package 11 2
rhn_channel.update_channels_by_errata 12 2
rpm.isdigit 10 2
rpm.isalpha 11 2
rpm.isalphanum 12 2
rpm.rpmstrcmp 109 10
rpm.vercmp 28 5
rpm.vercmpcounter 5 2
rpm.vercmpresetcounter 9 2
rhn_date_manip.get_reporting_period_start 17 2
rhn_date_manip.get_reporting_period_end 18 2

Total Estimate: 658 days

Migrate Perl Pages Which Use Stored Procedures

Over 200 perl pages remain. The effort in eliminating them all would be too much additional overhead so the current plan is to identify which perl pages actually use stored procedures, and if the number is managable, port only these to application code.

TODO: Add similar table here listing perl pages that require porting.

Replace Stored Procedure Usage In the Backend

Appears to be about 7 stored procedures used in the Python backend. (look for usage of rhnSQL.Function)

If required, we may wish to port the procedures to application code, but expose them over an RPC API to the Python backend. See the ThriftApiProposal.

However given the small number of procedures involved, other solutions may present themselves when the time comes.

SQL Standardization

Examine the SQL in Hibernate and Datasource queries for anything that may be Oracle specific and remove it.

Potential Hazards

Custom Data Types

Spacewalk uses a custom data type EVR_T for rhnPackageEvr table. This EVR_T data type is mainly used for ordering. Look at evr_t.sql for the definition of this data type. While Postgres has the facility to create custom data type, it does not have a 'ORDER MEMBER FUNCTION' to dictate natural ordering for the custom data type. Here are some example queries that make use of the order member function in Package_queries.xml...

<mode name="system_latest_all_available_packages">
........
   SELECT  p.name_id name_id,
                 p.id,
                 p.evr_id,
                 max(pe.evr) evr,
      from ....
.....
</mode>

<mode name="system_all_available_packages">
........
   SELECT  pn.name AS NAME,....
      from ....
      where ....
...     AND PE2.evr &gt;= full_list.evr)
.....
</mode>

or System_queries.xml

<mode name="system_installed_packages">
...
SELECT ...
  FROM ...
 WHERE ...
ORDER BY PN.name, PE.evr
...
</mode>

To make things a little worse, this EVR_T is used in DB table view definitions that rely on this magical ordering making it difficult to port in application code, for example rhnChannelNewestPackageView.sql

create or replace view
rhnChannelNewestPackageView
as
...
where 
...
and pe.evr = (select	max(sq_pe.evr)
...

PostgreSQL has a way to create User defined Aggregate Functions which has to be explored more and might work well for max, min,and order by have to be worked out... In this case custom methods like evr_max() and evr_min() will have to be created for both oracle and postgres and all the application quesries and view queries have to be replaced with this..

Remove Synonyms

PostgreSQL does not have synonyms which are heavily used in the monitoring code.

No Packages In PostgreSQL

NOTE: This does not appear to be an issue if we are replacing all stored procedures. Will omit this hudle from the tasks section for the time being.

80% of our Oracle stored procedures are in Package and Package Body. Postgres however does not have the concept of package bodies.. So sql queries with calls like rhn_config_channel.get_user_file_access will have issues.. A fix for this issue might be to define Schema for each Package name space and define package functions and procedures in the schema .. For example in the case of rhn_config_channel.get_user_file_access, we will have rhn_config_channel schema and get_user_file_access as a stored procedure in it.. Only issue here is that we will lose ability to store member variables in package body.. But the instances I checked there havent been many packages that have instance variables.

Stored Procedures Referenced Within Schema Itself

There is a certain set of stored procedures within the schema that are referenced from within the database code directly. An example of this would be this usage of a stored procedure call within a view:

create or replace view rhnUserChannel as
select	cfp.user_id		user_id,
	cfp.org_id		org_id,
	cfm.channel_id		channel_id,
	'manage'		role
from	rhnChannelFamilyMembers	cfm,
	rhnUserChannelFamilyPerms cfp
where	cfp.channel_family_id = cfm.channel_family_id
	and rhn_channel.user_role_check(cfm.channel_id,
		cfp.user_id, 'manage') = 1
  ...

These stored procedures can not be extracted out into our RPC layer because they need to be called from within the schema itself. The above need to be either:

  1. Ported to PG/SQL
  2. Inlined into the schema where the procedure is being called from (less desirable, and probably often quite nasty).

| Task | Estimate | Owner | Status | | --- | --- | --- | --- | --- | | Replace EVR_T Class | 15 | | | | | Remove synonyms | | | | | | Remove use of stored procedures within schema | | | | |

Links