Desktop: Add support for OneNote importer #10255

Closed
63 commits
d8fbf7c
bringing converter to project
pedr Apr 1, 2024
6650d5c
making it work on desktop
pedr Apr 1, 2024
947835e
allow to execute more than one time
pedr Apr 1, 2024
d184ece
using workspaces to link the dependency
pedr Apr 3, 2024
34e067b
process zip file from onenote exporter
pedr Apr 3, 2024
e9ec561
improving ux of developer with onenote dependency
pedr Apr 4, 2024
9d08fd1
Merge remote-tracking branch 'upstream/dev' into add-onenote-parser-lib
pedr Apr 4, 2024
472399c
remove workflow files
pedr Apr 4, 2024
d03ba41
allow to import more than one notebook at a time
pedr Apr 4, 2024
50d6d33
Merge remote-tracking branch 'upstream/dev' into add-onenote-parser-lib
pedr Apr 6, 2024
f96b3b1
moving build step to package script
pedr Apr 6, 2024
8e1ae98
remove generated files from eslint rule
pedr Apr 6, 2024
51a7619
add generated file to correct eslint ignore
pedr Apr 6, 2024
221be30
add lib dependency to Dockerfile
pedr Apr 8, 2024
112e3f4
add rust dependency
pedr Apr 8, 2024
644c8cc
Merge branch 'dev' into add-onenote-parser-lib
laurent22 Apr 10, 2024
fa728f1
adding onenote-converter build package to version control
pedr Apr 10, 2024
0cc1e4b
cleaning up logs
pedr Apr 10, 2024
1196e10
fix a bug where files that don't exist are trying to be read
pedr Apr 10, 2024
e0f3687
add wasm-pack as a dependency to speed up building
pedr Apr 10, 2024
0702216
include rust analyzer to rust package on vscode
pedr Apr 11, 2024
0ea2e14
Merge remote-tracking branch 'refs/remotes/origin/add-onenote-parser-…
pedr Apr 11, 2024
0a4a7c8
running cargo fmt
pedr Apr 11, 2024
ef21ac1
refactoring log macro
pedr Apr 17, 2024
4ec43d8
new build artifacts
pedr Apr 17, 2024
f49887f
add a new html importer to modify onenote embed links
pedr Apr 18, 2024
c13f313
a better implementation of embed to anchor
pedr Apr 18, 2024
7ab9876
removing duplicated implementation
pedr Apr 18, 2024
73f6ef2
changing error type to log information about the type error
pedr Apr 30, 2024
b06c4df
fix problem when trying to find files that don't exist/are too long
pedr Apr 30, 2024
cee6389
implementing read dir in js
pedr Apr 30, 2024
124a7ad
trying to find a workaround for the line_spacing implementation
pedr Apr 30, 2024
e9ef6d9
fixing width to emulate onenote page
pedr Apr 30, 2024
4560d9a
line height implementation
pedr May 3, 2024
923e4b6
wasm build artifacts
pedr May 3, 2024
4f1a3e4
fmt changes
pedr May 3, 2024
3e8d229
remove exit call
pedr May 3, 2024
0fe80b7
updating artifact
pedr May 3, 2024
c28eee1
adding first test
pedr May 3, 2024
d5349eb
testing subpage indentation
pedr May 6, 2024
1722da5
Merge remote-tracking branch 'upstream/dev' into add-onenote-parser-lib
pedr May 6, 2024
7938646
fixing logs calls
pedr May 8, 2024
d5522d0
bring back error instead of panicking
pedr May 8, 2024
c002b72
make subsections work as notebooks
pedr May 9, 2024
c897e3b
fixing starting point to toc if available, else use .one files
pedr May 9, 2024
990dd54
fixing type error
pedr May 9, 2024
0deb4fc
allowing injection of id generator in BaseModel
pedr May 9, 2024
b7059e7
new build artifact
pedr May 9, 2024
85d1d40
adding basic snapshot testing
pedr May 9, 2024
8da0b30
disable logs in non dev builds
pedr May 9, 2024
ab19916
transform svg node into base64 images
pedr May 10, 2024
186dcb8
partially correct, in some cases pages are created not in the correct…
pedr May 20, 2024
afaa3eb
adding fallback paragraph style data
pedr May 21, 2024
6c615bc
formatting code
pedr May 21, 2024
5dd4820
making .one files work without a onetoc2 file
pedr May 21, 2024
0bb4684
improving logs
pedr May 21, 2024
9de766c
.onetoc2 should only exist on local onenote
pedr May 21, 2024
6e3fbae
updating tests
pedr May 21, 2024
f215b04
adding one more tests about group sections
pedr May 21, 2024
11106a7
removing .one files from recycle bin
pedr May 21, 2024
a151cdb
removing err since we are actually supporting section groups
pedr May 21, 2024
8f32d2a
updating artifact
pedr May 21, 2024
4ef75ef
add readme.md
pedr May 24, 2024
3 changes: 3 additions & 0 deletions .eslintignore
@@ -83,6 +83,7 @@ plugin_types/
readme/
packages/react-native-vosk/lib/
packages/lib/countable/Countable.js
packages/onenote-converter/pkg/onenote_converter.js

# AUTO-GENERATED - EXCLUDED TYPESCRIPT BUILD
packages/app-cli/app/LinkSelector.js
@@ -1002,6 +1003,8 @@ packages/lib/services/interop/InteropService_Importer_Md.test.js
packages/lib/services/interop/InteropService_Importer_Md.js
packages/lib/services/interop/InteropService_Importer_Md_frontmatter.test.js
packages/lib/services/interop/InteropService_Importer_Md_frontmatter.js
packages/lib/services/interop/InteropService_Importer_OneNote.test.js
packages/lib/services/interop/InteropService_Importer_OneNote.js
packages/lib/services/interop/InteropService_Importer_Raw.test.js
packages/lib/services/interop/InteropService_Importer_Raw.js
packages/lib/services/interop/Module.test.js
2 changes: 2 additions & 0 deletions .gitignore
@@ -982,6 +982,8 @@ packages/lib/services/interop/InteropService_Importer_Md.test.js
packages/lib/services/interop/InteropService_Importer_Md.js
packages/lib/services/interop/InteropService_Importer_Md_frontmatter.test.js
packages/lib/services/interop/InteropService_Importer_Md_frontmatter.js
packages/lib/services/interop/InteropService_Importer_OneNote.test.js
packages/lib/services/interop/InteropService_Importer_OneNote.js
packages/lib/services/interop/InteropService_Importer_Raw.test.js
packages/lib/services/interop/InteropService_Importer_Raw.js
packages/lib/services/interop/Module.test.js
1 change: 1 addition & 0 deletions Dockerfile.server
@@ -33,6 +33,7 @@ COPY packages/renderer ./packages/renderer
COPY packages/tools ./packages/tools
COPY packages/utils ./packages/utils
COPY packages/lib ./packages/lib
COPY packages/onenote-converter ./packages/onenote-converter
COPY packages/server ./packages/server

# For some reason there's both a .yarn/cache and .yarn/berry/cache that are
3 changes: 3 additions & 0 deletions joplin.code-workspace
@@ -5,6 +5,9 @@
},
],
"settings": {
"rust-analyzer.linkedProjects": [
"./packages/onenote-converter/Cargo.toml",
],
"files.exclude": {
"_mydocs/mdtest/": true,
"_releases/": true,
1 change: 1 addition & 0 deletions package.json
@@ -2,6 +2,7 @@
"name": "root",
"private": true,
"workspaces": [
"packages/onenote-converter/pkg",
"packages/*"
],
"repository": {
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
13 changes: 11 additions & 2 deletions packages/lib/BaseModel.ts
@@ -1,11 +1,11 @@
import paginationToSql from './models/utils/paginationToSql';
import Database from './database';
-import uuid from './uuid';
import time from './time';
import JoplinDatabase, { TableField } from './JoplinDatabase';
import { LoadOptions, SaveOptions } from './models/utils/types';
import ActionLogger, { ItemActionType as ItemActionType } from './utils/ActionLogger';
import { SqlQuery } from './services/database/types';
+import uuid from './uuid';
const Mutex = require('async-mutex').Mutex;

// New code should make use of this enum
@@ -80,6 +80,8 @@ class BaseModel {
['TYPE_COMMAND', ModelType.Command],
];

private static uuidGenerator: ()=> string = uuid.create;

public static TYPE_NOTE = ModelType.Note;
public static TYPE_FOLDER = ModelType.Folder;
public static TYPE_SETTING = ModelType.Setting;
@@ -573,7 +575,7 @@ class BaseModel {

if (options.isNew) {
if (this.useUuid() && !o.id) {
-modelId = uuid.create();
+modelId = this.generateUuid();
o.id = modelId;
}

@@ -754,6 +756,13 @@ class BaseModel {
return this.db_;
}

public static generateUuid() {
return this.uuidGenerator();
}

public static setIdGenerator(generator: ()=> string) {
this.uuidGenerator = generator;
}
// static isReady() {
// return !!this.db_;
// }
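Reviewer note: the `setIdGenerator` hook in this hunk lets tests swap the UUID source for a deterministic one. Below is a minimal standalone sketch of that pattern, not the PR's code. `ExampleModel`, `defaultIdGenerator` and `withIdGenerator` are hypothetical names introduced only for illustration. Restoring the default inside `finally` keeps an override from leaking into later tests when the callback throws.

```typescript
// Hypothetical stand-in for uuid.create; this is not Joplin's implementation.
const defaultIdGenerator = (): string => Math.random().toString(16).slice(2).padEnd(8, '0');

class ExampleModel {
	private static idGenerator: () => string = defaultIdGenerator;

	public static generateId(): string {
		return this.idGenerator();
	}

	public static setIdGenerator(generator: () => string): void {
		this.idGenerator = generator;
	}
}

// Runs a synchronous callback with a temporary generator, then always restores the default.
function withIdGenerator<T>(generator: () => string, callback: () => T): T {
	ExampleModel.setIdGenerator(generator);
	try {
		return callback();
	} finally {
		ExampleModel.setIdGenerator(defaultIdGenerator);
	}
}

let counter = 0;
const ids = withIdGenerator(() => String(counter++), () => [
	ExampleModel.generateId(),
	ExampleModel.generateId(),
]);
console.log(ids); // [ '0', '1' ]
```

For an asynchronous import, a helper like this would need to `await` the callback before restoring the generator.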
1 change: 0 additions & 1 deletion packages/lib/htmlUtils.ts
@@ -172,7 +172,6 @@ class HtmlUtils {

return output.join(' ');
}

}

export default new HtmlUtils();
3 changes: 3 additions & 0 deletions packages/lib/package.json
@@ -17,6 +17,7 @@
},
"devDependencies": {
"@testing-library/react-hooks": "8.0.1",
"@types/adm-zip": "0.5.5",
"@types/fs-extra": "11.0.4",
"@types/jest": "29.5.8",
"@types/js-yaml": "4.0.9",
@@ -44,11 +45,13 @@
"@joplin/fork-sax": "^1.2.55",
"@joplin/fork-uslug": "^1.0.16",
"@joplin/htmlpack": "~3.0",
"@joplin/onenote-converter": "0.0.1",
"@joplin/renderer": "~3.0",
"@joplin/turndown": "^4.0.73",
"@joplin/turndown-plugin-gfm": "^1.0.55",
"@joplin/utils": "~3.0",
"@types/nanoid": "3.0.0",
"adm-zip": "0.5.12",
"async-mutex": "0.4.1",
"base-64": "1.0.0",
"base64-stream": "1.0.0",
9 changes: 9 additions & 0 deletions packages/lib/services/interop/InteropService.ts
@@ -19,6 +19,7 @@ import InteropService_Exporter_Md_frontmatter from './InteropService_Exporter_Md
import InteropService_Importer_Base from './InteropService_Importer_Base';
import InteropService_Exporter_Base from './InteropService_Exporter_Base';
import Module, { dynamicRequireModuleFactory, makeExportModule, makeImportModule } from './Module';
import InteropService_Importer_OneNote from './InteropService_Importer_OneNote';
const { sprintf } = require('sprintf-js');
const { fileExtension } = require('../../path-utils');
const EventEmitter = require('events');
@@ -133,6 +134,14 @@ export default class InteropService {
isNoteArchive: false, // Tells whether the file can contain multiple notes (eg. Enex or Jex format)
description: _('Text document'),
}, () => new InteropService_Importer_Md()),

makeImportModule({
format: 'zip',
fileExtensions: ['zip'],
sources: [FileSystemItem.File],
isNoteArchive: false, // Tells whether the file can contain multiple notes (eg. Enex or Jex format)
description: _('OneNote Notebook'),
}, () => new InteropService_Importer_OneNote()),
];

const exportModules = [
7 changes: 4 additions & 3 deletions packages/lib/services/interop/InteropService_Importer_Md.ts
@@ -12,7 +12,7 @@ import htmlUtils from '../../htmlUtils';
import { unique } from '../../ArrayUtils';
const { pregQuote } = require('../../string-utils-common');
import { MarkupToHtml } from '@joplin/renderer';
-import { isDataUrl } from '@joplin/utils/url';
+import { isDataUrl, isMailTo, isFilenameTooLong } from '@joplin/utils/url';
import { stripBom } from '../../string-utils';

export default class InteropService_Importer_Md extends InteropService_Importer_Base {
@@ -108,11 +108,12 @@ export default class InteropService_Importer_Md extends InteropService_Importer_
let updated = md;
const markdownLinks = markdownUtils.extractFileUrls(md);
const htmlLinks = htmlUtils.extractFileUrls(md);
-const fileLinks = unique(markdownLinks.concat(htmlLinks));
+const pdfLinks = htmlUtils.extractPdfUrls(md);
+const fileLinks = unique(markdownLinks.concat(htmlLinks).concat(pdfLinks));
for (const encodedLink of fileLinks) {
const link = decodeURI(encodedLink);

-if (isDataUrl(link)) {
+if (isDataUrl(link) || isMailTo(link) || isFilenameTooLong(link)) {
// Just leave it as it is. We could potentially import
// it as a resource but for now that's good enough.
} else {
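Reviewer note: the implementations of `isMailTo` and `isFilenameTooLong` are not part of the hunks shown here, so the sketch below only guesses at the kind of check they might perform. Every function name ending in `Sketch`, the `shouldSkipLink` helper and the 255 limit are assumptions made for illustration, not the PR's code.

```typescript
// Assumed limit: many filesystems cap a single file name at 255 bytes.
// This sketch compares the string length rather than the encoded byte count.
const maxFilenameLength = 255;

const isDataUrlSketch = (url: string): boolean => url.toLowerCase().startsWith('data:');
const isMailToSketch = (url: string): boolean => url.toLowerCase().startsWith('mailto:');

// Measures only the last path segment, since the limit applies per file name.
const isFilenameTooLongSketch = (url: string): boolean => {
	const segments = url.split(/[\\/]/);
	const name = segments[segments.length - 1];
	return name.length > maxFilenameLength;
};

// Links that are left untouched instead of being imported as resources.
const shouldSkipLink = (link: string): boolean =>
	isDataUrlSketch(link) || isMailToSketch(link) || isFilenameTooLongSketch(link);

const links = ['mailto:someone@example.com', 'images/photo.png', `files/${'a'.repeat(300)}.png`];
console.log(links.filter(link => !shouldSkipLink(link))); // [ 'images/photo.png' ]
```

Grouping the guards in one predicate keeps the loop body to a single branch, which may be easier to extend if more link types need to be skipped.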
129 changes: 129 additions & 0 deletions packages/lib/services/interop/InteropService_Importer_OneNote.test.ts
@@ -0,0 +1,129 @@
import Note from '../../models/Note';
import Folder from '../../models/Folder';
import * as fs from 'fs-extra';
import { createTempDir, setupDatabaseAndSynchronizer, supportDir, switchClient } from '../../testing/test-utils';
import { NoteEntity } from '../database/types';
import InteropService_Importer_OneNote from './InteropService_Importer_OneNote';
import { MarkupToHtml } from '@joplin/renderer';
import BaseModel from '../../BaseModel';
import uuid from '../../uuid';

describe('InteropService_Importer_OneNote', () => {
let tempDir: string;
async function importNote(path: string) {
const newFolder = await Folder.save({ title: 'folder' });
const importer = new InteropService_Importer_OneNote();
await importer.init(path, {
format: 'md',
outputFormat: 'md',
path,
destinationFolder: newFolder,
destinationFolderId: newFolder.id,
});
importer.setMetadata({ fileExtensions: ['md'] });
await importer.exec({ warnings: [] });
const allNotes: NoteEntity[] = await Note.all();
return allNotes;
}
beforeEach(async () => {
await setupDatabaseAndSynchronizer(1);
await switchClient(1);
tempDir = await createTempDir();
});
afterEach(async () => {
await fs.remove(tempDir);
});
it('should import a simple OneNote notebook', async () => {
const notes = await importNote(`${supportDir}/onenote/simple_notebook.zip`);
const folders = await Folder.all();

expect(notes.length).toBe(2);
const mainNote = notes[0];

expect(folders.length).toBe(3);
const parentFolder = folders.find(f => f.id === mainNote.parent_id);
expect(parentFolder.title).toBe('Section title');
expect(folders.find(f => f.id === parentFolder.parent_id).title).toBe('Simple notebook');

expect(mainNote.title).toBe('Page title');
expect(mainNote.markup_language).toBe(MarkupToHtml.MARKUP_LANGUAGE_HTML);
expect(mainNote.body).toMatchSnapshot(mainNote.title);
});

it('should preserve indentation of subpages in Section page', async () => {
const notes = await importNote(`${supportDir}/onenote/subpages.zip`);

const sectionPage = notes.find(n => n.title === 'Section');
const menuHtml = sectionPage.body.split('<ul>')[1].split('</ul>')[0];
const menuLines = menuHtml.split('</li>');

const pageTwo = notes.find(n => n.title === 'Page 2');
expect(menuLines[3].trim()).toBe(`<li class="l1"><a href=":/${pageTwo.id}" target="content" title="Page 2">${pageTwo.title}</a>`);

const pageTwoA = notes.find(n => n.title === 'Page 2-a');
expect(menuLines[4].trim()).toBe(`<li class="l2"><a href=":/${pageTwoA.id}" target="content" title="Page 2-a">${pageTwoA.title}</a>`);

const pageTwoAA = notes.find(n => n.title === 'Page 2-a-a');
expect(menuLines[5].trim()).toBe(`<li class="l3"><a href=":/${pageTwoAA.id}" target="content" title="Page 2-a-a">${pageTwoAA.title}</a>`);

const pageTwoB = notes.find(n => n.title === 'Page 2-b');
expect(menuLines[7].trim()).toBe(`<li class="l2"><a href=":/${pageTwoB.id}" target="content" title="Page 2-b">${pageTwoB.title}</a>`);
});

it('should create subsections', async () => {
const notes = await importNote(`${supportDir}/onenote/subsections.zip`);
const folders = await Folder.all();

const parentSection = folders.find(f => f.title === 'Group Section 1');
const subSection = folders.find(f => f.title === 'Group Section 1-a');
const subSection1 = folders.find(f => f.title === 'Subsection 1');
const subSection2 = folders.find(f => f.title === 'Subsection 2');
const notesFromParentSection = notes.filter(n => n.parent_id === parentSection.id);

expect(parentSection.id).toBe(subSection1.parent_id);
expect(parentSection.id).toBe(subSection2.parent_id);
expect(parentSection.id).toBe(subSection.parent_id);
expect(folders.length).toBe(7);
expect(notes.length).toBe(6);
expect(notesFromParentSection.length).toBe(2);
});

it('should expect notes to be rendered the same', async () => {
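let idx = 0;
// Use sequential IDs so that snapshots stay stable between runs.
BaseModel.setIdGenerator(() => String(idx++));
try {
const notes = await importNote(`${supportDir}/onenote/complex_notes.zip`);

for (const note of notes) {
expect(note.body).toMatchSnapshot(note.title);
}
} finally {
// Restore the default generator even if an assertion fails.
BaseModel.setIdGenerator(uuid.create);
}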
});

it('should render the proper tree for notebook with group sections', async () => {
const notes = await importNote(`${supportDir}/onenote/group_sections.zip`);
const folders = await Folder.all();

const mainFolder = folders.find(f => f.title === 'Notebook created on OneNote App');
const section = folders.find(f => f.title === 'Section');
const sectionA1 = folders.find(f => f.title === 'Section A1');
const sectionA = folders.find(f => f.title === 'Section A');
const sectionB1 = folders.find(f => f.title === 'Section B1');
const sectionB = folders.find(f => f.title === 'Section B');
const sectionD1 = folders.find(f => f.title === 'Section D1');
const sectionD = folders.find(f => f.title === 'Section D');

expect(section.parent_id).toBe(mainFolder.id);
expect(sectionA.parent_id).toBe(mainFolder.id);
expect(sectionD.parent_id).toBe(mainFolder.id);

expect(sectionA1.parent_id).toBe(sectionA.id);
expect(sectionB.parent_id).toBe(sectionA.id);

expect(sectionB1.parent_id).toBe(sectionB.id);
expect(sectionD1.parent_id).toBe(sectionD.id);

expect(notes.filter(n => n.parent_id === sectionA1.id).length).toBe(2);
expect(notes.filter(n => n.parent_id === sectionB1.id).length).toBe(2);
expect(notes.filter(n => n.parent_id === sectionD1.id).length).toBe(1);
});
});
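Reviewer note: the tree tests above compare `parent_id` values pair by pair. One possible alternative, sketched here under assumptions, is a small helper that turns each folder into a path string so the expected hierarchy can be asserted in one line. `FolderLike` and `folderPath` are hypothetical names, and the sketch assumes root folders have an empty `parent_id`.

```typescript
// Minimal shape assumed for this sketch; the real folder entity has more fields.
interface FolderLike {
	id: string;
	title: string;
	parent_id: string;
}

// Builds "Notebook/Section A/Section B" by walking parent_id links up to the root.
function folderPath(folders: FolderLike[], id: string): string {
	const byId = new Map<string, FolderLike>();
	for (const folder of folders) byId.set(folder.id, folder);

	const titles: string[] = [];
	const visited = new Set<string>(); // guards against accidental cycles
	let current = byId.get(id);
	while (current && !visited.has(current.id)) {
		visited.add(current.id);
		titles.unshift(current.title);
		current = byId.get(current.parent_id);
	}
	return titles.join('/');
}

const sampleFolders: FolderLike[] = [
	{ id: '1', title: 'Notebook', parent_id: '' },
	{ id: '2', title: 'Section A', parent_id: '1' },
	{ id: '3', title: 'Section B', parent_id: '2' },
];
console.log(folderPath(sampleFolders, '3')); // Notebook/Section A/Section B
```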
70 changes: 70 additions & 0 deletions packages/lib/services/interop/InteropService_Importer_OneNote.ts
@@ -0,0 +1,70 @@
import { ImportExportResult, ImportModuleOutputFormat } from './types';

import InteropService_Importer_Base from './InteropService_Importer_Base';
import { NoteEntity } from '../database/types';
import { rtrimSlashes } from '../../path-utils';
import { oneNoteConverter } from '@joplin/onenote-converter';
import * as AdmZip from 'adm-zip';
import InteropService_Importer_Md from './InteropService_Importer_Md';
import { join, resolve } from 'path';
import Logger from '@joplin/utils/Logger';
import path = require('path');

const logger = Logger.create('InteropService_Importer_OneNote');

export default class InteropService_Importer_OneNote extends InteropService_Importer_Base {
protected importedNotes: Record<string, NoteEntity> = {};

private getEntryDirectory(unzippedPath: string, entryName: string) {
const withoutBasePath = entryName.replace(unzippedPath, '');
return path.normalize(withoutBasePath).split(path.sep)[0];
}

public async exec(result: ImportExportResult) {
const sourcePath = rtrimSlashes(this.sourcePath_);
const unzipTempDirectory = await this.temporaryDirectory_(true);
const zip = new AdmZip(sourcePath);
logger.info('Unzipping files...');
zip.extractAllTo(unzipTempDirectory, false);

const files = zip.getEntries();
if (files.length === 0) {
result.warnings.push('Zip file has no files.');
return result;
}

// Files that don't have a name seem to be local-only and shouldn't be processed

const tempOutputDirectory = await this.temporaryDirectory_(true);
const baseFolder = this.getEntryDirectory(unzipTempDirectory, files[0].entryName);
const notebookBaseDir = path.join(unzipTempDirectory, baseFolder, path.sep);
const outputDirectory2 = path.join(tempOutputDirectory, baseFolder);

const notebookFiles = files.filter(e => e.name !== '.onetoc2' && e.name !== 'OneNote_RecycleBin.onetoc2');

logger.info('Extracting OneNote to HTML');
for (const notebookFile of notebookFiles) {
const notebookFilePath = join(unzipTempDirectory, notebookFile.entryName);
try {
await oneNoteConverter(notebookFilePath, resolve(outputDirectory2), notebookBaseDir);
} catch (error) {
logger.error(`Failed to convert ${notebookFilePath}:`, error);
}
}

logger.info('Importing HTML into Joplin');
const importer = new InteropService_Importer_Md();
importer.setMetadata({ fileExtensions: ['html'] });
await importer.init(tempOutputDirectory, {
...this.options_,
format: 'html',
outputFormat: ImportModuleOutputFormat.Html,

});
logger.info('Finished');
result = await importer.exec(result);

// TODO: remove temp directories?
return result;
}
}
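Reviewer note: `getEntryDirectory` depends on `path.normalize` and `path.sep`, so its result is tied to the host platform's separator. The sketch below shows a platform-independent variant that splits on either separator. `topLevelDirectory` is a hypothetical name, and this is an illustration rather than a drop-in replacement.

```typescript
// Returns the first path segment of a zip entry name, accepting "/" or "\" as separator.
function topLevelDirectory(entryName: string): string {
	const segments = entryName.split(/[\\/]+/).filter(segment => segment.length > 0);
	return segments.length > 0 ? segments[0] : '';
}

console.log(topLevelDirectory('Simple notebook/Section title.one')); // Simple notebook
console.log(topLevelDirectory('Simple notebook\\Section title.one')); // Simple notebook
console.log(topLevelDirectory('/leading/slash.one')); // leading
```

Like the method in the diff, this returns the file name itself when the entry sits at the archive root, so a caller may want to handle that case separately.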