Update your Obsidian clippings to YAML

Why This Code?

To convert a vault that uses Dataview inline fields to YAML front matter and take advantage of the new Obsidian properties (see the changelog).

To clip articles to Obsidian, you might have used the Obsidian WebClipper offered by Steph Ango, the CEO of Obsidian, or a derivative of it.

In this case, you might have ended up with files that have the following structure:

author:: XXX
source:: URL link 
clipped:: DateOfClipping
published:: DateOfPublication 

#clippings

But now you want something more like:

---
category: "[[Clippings]]"
author: XXX scraped from the META tags of the page at URL
title: Title scraped from the META tags of the page at URL
source: URL link
clipped: DateOfClipping
description: Description scraped from the META tags of the page at URL
summary: ""  (Some space to put the summary later)
tags:
  - AI
  - other tags taken from the META keywords tag
publish: false
---

Note that I have also updated the official WebClipper to give a consistent result; here is my JS version: WebClipper
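
The conversion described above, from Dataview inline fields to YAML front matter, can be sketched roughly as follows. This is a minimal illustration and not the code in index.js: the function name `convertInlineFields` is hypothetical, it only rewrites the block of `key:: value` and `#tag` lines at the top of a note, and it does not fetch the page or add META-derived fields such as title or description.

```javascript
// Hypothetical helper (not taken from index.js): turn the leading
// "key:: value" and "#tag" lines of a note into YAML front matter.
function convertInlineFields(markdown) {
  const lines = markdown.split(/\r?\n/);
  const fields = {};
  const tags = [];
  let i = 0;

  // Consume only the header block; stop at the first ordinary line.
  for (; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === '') continue;

    const field = line.match(/^([A-Za-z][\w-]*)::\s*(.*)$/);
    if (field) {
      fields[field[1]] = field[2];
      continue;
    }

    // A line made only of tags, e.g. "#clippings #ai"
    if (/^(#[\w\/-]+\s*)+$/.test(line)) {
      tags.push(...line.split(/\s+/).map((tag) => tag.slice(1)));
      continue;
    }
    break;
  }

  const yaml = ['---'];
  for (const [key, value] of Object.entries(fields)) {
    // JSON.stringify yields a double-quoted string, which YAML accepts.
    yaml.push(`${key}: ${JSON.stringify(value)}`);
  }
  if (tags.length > 0) {
    yaml.push('tags:');
    for (const tag of tags) yaml.push(`  - ${tag}`);
  }
  yaml.push('---', '');

  return yaml.join('\n') + lines.slice(i).join('\n');
}
```

Quoting every value is a conservative choice here: it avoids YAML misreading URLs or dates, at the cost of slightly noisier front matter.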

How to Install

Clone the repository, go into it, then install the packages:

npm install

How to Use

First, remember to back up your vault before running the script.

Place all the Markdown files you want to process in a Ressources subdirectory of your vault. Then create a symbolic link to this subdirectory from within the project repo:
```
cd ObsidianClippingsUpdate
ln -s -v PATH_TO_YOUR_RESSOURCEDIR ./Ressources
```
Run the script:
```
node index.js
```
Processed files will be moved to the Ressources/Processed directory.
New Markdown files with the fetched article content will be generated in the Ressources/Result directory.
Files that could not be processed will be moved to the Ressources/ToProcessManually directory.
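
The directory workflow above can be pictured with a small sketch. It is an assumption about how such routing might look, not the actual logic of index.js: `moveTo` and `fileOutcome` are hypothetical names, and `newMarkdown === null` stands in for "this file could not be processed".

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: move a file from the Ressources root into a sub-directory,
// creating that sub-directory if it does not exist yet.
function moveTo(resourcesDir, fileName, subDir) {
  const targetDir = path.join(resourcesDir, subDir);
  fs.mkdirSync(targetDir, { recursive: true });
  const target = path.join(targetDir, fileName);
  fs.renameSync(path.join(resourcesDir, fileName), target);
  return target;
}

// Hypothetical helper: write the regenerated note to Result and archive the
// original in Processed, or set the original aside in ToProcessManually.
function fileOutcome(resourcesDir, fileName, newMarkdown) {
  if (newMarkdown === null) {
    return moveTo(resourcesDir, fileName, 'ToProcessManually');
  }
  const resultDir = path.join(resourcesDir, 'Result');
  fs.mkdirSync(resultDir, { recursive: true });
  fs.writeFileSync(path.join(resultDir, fileName), newMarkdown, 'utf8');
  return moveTo(resourcesDir, fileName, 'Processed');
}
```

For example, `fileOutcome('./Ressources', 'note.md', null)` would move `note.md` into `Ressources/ToProcessManually` and leave `Result` untouched.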

Notes:

Check the error.log file for any errors that may have occurred during the process.
Expect to process around 10 to 20% of the files manually.
I have noticed that the WebClipper sometimes doesn't produce very clean image links. fixMarkdown.js is an attempt to fix this.

What Does It Do?

The index.js script performs the following tasks:

Reads markdown files from the Ressources directory.
Extracts the URL that follows a "source", "src" or "url" field in each Markdown file.
Re-implements a version of the WebClipper to generate a new Markdown file from that URL.
Writes the newly generated markdown content into a Result sub-directory within Ressources.
Moves processed files into a Processed sub-directory within Ressources.
If a URL is not found, moves the original file into a ToProcessManually sub-directory within Ressources.
Logs any further errors that occur to error.log.
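
The URL extraction step in the list above could look something like the sketch below. It is an illustration under assumptions, not the pattern index.js actually uses: `extractSourceUrl` is a hypothetical name, and the regular expression assumes the field sits at the start of a line, is followed by `:` or `::`, and that the URL starts with `http://` or `https://`.

```javascript
// Hypothetical helper: return the first URL found after a "source", "src"
// or "url" field, or null when the note has no such field.
function extractSourceUrl(markdown) {
  // i: field names are case-insensitive; m: ^ matches at every line start.
  // ".*?" skips link syntax such as "[Title](" but never crosses a newline.
  const pattern = /^(?:source|src|url)\s*::?\s*.*?(https?:\/\/[^\s)>\]]+)/im;
  const match = markdown.match(pattern);
  return match ? match[1] : null;
}

const sample = [
  'author:: XXX',
  'source:: [Some article](https://example.com/article)',
  'clipped:: 2023-09-01',
].join('\n');

console.log(extractSourceUrl(sample));
```

A `null` result corresponds to the "URL is not found" case, where the original file would be set aside for manual processing.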

License

This project is licensed under the MIT License. However, other pieces of code are subject to their own licenses:

Readability.js by Mozilla (licensed under the Apache License, Version 2.0)
Turndown by Dom Christie (licensed under the MIT License)

jppaolim/ObsidianClippingsUpdate
