globeandmail/pb-data-analysis-scripts

 
 


Example code is provided by a community of developers. They are intended to help you get started more quickly, but are not guaranteed to cover all scenarios nor are they supported by Arc XP.

These examples are licensed under the MIT license: THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Reiterated from license above, all code in this example is free to use, and as such, there is NO WARRANTY, SLA or SUPPORT for these examples.


Scripts to analyze pb-data locally

Notes:

  • pb-data is a database snapshot of PageBuilder data, taken every 12 hours.

Requirements

  • The bsondump and duckdb commands must be available in your shell (see step 1 of "How to use").

How to use

  1. Install the dependencies and make sure the bsondump and duckdb commands are available in your current bash session.
  2. Download the pb-data to your local computer from the Arc XP Admin > PageBuilder > Developer Tools > PB Data screen.
  3. Extract the downloaded tar file, rename the extracted folder to pb-data, and place it in this project's root folder.
  4. Run sh _prepare.sh to create a temporary database file (a duckdb file) containing the simplified views used by the other shell scripts. These views do not copy the actual data; they only refer to the JSON files converted from the mongodb bson files.
  5. Run any of the shell scripts. The scripts are plain and simple: you can read the code to understand the parameters, or add the -h argument to a script to see its help text (e.g. sh find-pages-by-feature-name.sh -h).
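Steps 1 and 4 above can be sketched roughly as follows. This is only an illustration, not the contents of _prepare.sh: the helper names missing_commands and view_sql, the view name, and the file names page.bson, page.json and temp.duckdb are all hypothetical. The sketch assumes bsondump's --outFile option and DuckDB's read_json_auto function.

```shell
# Print the name of every given command that is not found on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Build the SQL for a view that reads a JSON file in place (no data is copied).
# $1 = view name (hypothetical), $2 = path to a JSON file (hypothetical).
view_sql() {
  printf "CREATE OR REPLACE VIEW %s AS SELECT * FROM read_json_auto('%s');\n" "$1" "$2"
}

# Example usage (commented out so the block has no side effects):
#   missing_commands bsondump duckdb    # prints nothing when both are installed
#   bsondump --outFile=pb-data/page.json pb-data/page.bson
#   duckdb temp.duckdb -c "$(view_sql example_page pb-data/page.json)"
```

If missing_commands prints anything, install the listed tools before running _prepare.sh.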

Video Tutorial


The video and tutorial can be found in the ALC documentation article "How to check feature/content-source usage using pb-data analysis scripts".

Scripts

In a terminal, run sh help.sh to view the help text for the scripts below.

📄 describe-page-or-template.sh -i <page_or_template_id>

Shows the metadata of a page or template (URI, title), its chains and features sorted by how many times each is used in the page or template, and the content sources configured in its features.

-i Page or Template ID (required, min 2 characters)

📁 all-chains-usage.sh [-c]

Produces a list of all chains used in your pb-data (not the bundle) in published pages and templates, along with the number of pages each chain is used in and the number of times (instances) it is used in those pages.

-c Output in CSV format
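The -c flag makes it easy to keep a script's results in a file for later use in a spreadsheet or another tool. A minimal sketch, where the helper name save_csv and the output file name are hypothetical:

```shell
# Run one of the scripts with -c and save its CSV output to a file.
# Usage: save_csv <script> <output_file> [extra script arguments...]
save_csv() {
  script=$1
  out=$2
  shift 2
  sh "$script" "$@" -c > "$out"
}

# Example usage (commented out; run it from the project root):
#   save_csv all-chains-usage.sh chains.csv
#   save_csv find-pages-by-chain-name.sh pages.csv -n <chain_name>
```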

🔍 find-pages-by-chain-name.sh -n <chain_name> [-c]

Produces a list of pages and templates that use a specific chain. You can open the PageBuilder editor using the following URL template, with the page or template ID in the query string: https://YOURORG.arcpublishing.com/pagebuilder/editor/curate?p=PAGEID

-n Chain name (required, min 2 characters)

-c Output in CSV format
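The URL template above can be filled in with a small helper. This is a sketch only: the function name editor_url is hypothetical, and it assumes the organization name and ID contain no characters that need URL encoding.

```shell
# Build a PageBuilder editor URL from an organization name and a page or template ID.
# $1 = organization (the YOURORG part), $2 = page or template ID (the PAGEID part).
editor_url() {
  printf 'https://%s.arcpublishing.com/pagebuilder/editor/curate?p=%s\n' "$1" "$2"
}

# Example usage (commented out):
#   editor_url YOURORG PAGEID
```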

📁 all-features-usage.sh [-c]

Produces a list of all features used in your pb-data (not the bundle) in published pages and templates, along with the number of pages each feature is used in and the number of times (instances) it is used in those pages.

-c Output in CSV format

🔍 find-pages-by-feature-name.sh -n <feature_name> [-c]

Produces a list of pages and templates that use a specific feature. You can open the PageBuilder editor using the following URL template, with the page or template ID in the query string: https://YOURORG.arcpublishing.com/pagebuilder/editor/curate?p=PAGEID

-n Feature name (required, min 2 characters)

-c Output in CSV format

📁 all-content-sources-usage.sh [-c]

Lists all content sources found in feature block configurations.

-c Output in CSV format

🔍 find-features-by-content-source.sh -n <content_source> [-c]

Produces a list of features that use a content source whose name matches the filter (a LIKE match, rather than an exact match).

-n Content source filter (required, min 2 characters)

-c Output in CSV format

📁 all-content-sources-resolvers.sh [-c]

Lists all content sources found in route resolver configurations.

-c Output in CSV format

🔍 find-resolvers-by-content-source.sh -n <content_source> [-c]

Produces a list of resolvers that use a specific content source name (exact match).

-n Content source name (required, min 2 characters)

-c Output in CSV format
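The two content-source scripts match names differently. The plain-shell sketch below only illustrates the difference in semantics; it is not how the scripts query the data, and it assumes that the LIKE match behaves as a substring match. The function names and the sample content source name are hypothetical.

```shell
# Substring-style match (assumed behaviour of the LIKE match).
# $1 = content source name, $2 = filter.
like_match() {
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Exact match: the whole name must equal the filter.
exact_match() {
  [ "$1" = "$2" ]
}

# Example usage (commented out), with a hypothetical content source name:
#   like_match  "story-feed-query" "feed" && echo "like: match"
#   exact_match "story-feed-query" "feed" || echo "exact: no match"
```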

📁 all-page-urls.sh [-c]

Lists the URLs of all pages. Templates are not included in the output, because they are served through dynamic URL patterns from resolvers.

-c Output in CSV format

🔍 find-pages-by-uri.sh -u <uri_filter> [-c]

Lists all pages whose URI contains the provided filter.

-u URI filter (required, min 2 characters)

-c Output in CSV format

📦 view-page-and-template.sh [-c]

Selects all rows from view_page_and_template.

-c Output in CSV format

📦 view-rendering.sh [-c]

Selects all rows from view_rendering.

-c Output in CSV format

📦 view-resolver.sh [-c]

Selects all rows from view_resolver.

-c Output in CSV format
