Missing sharepoint drive items #844

Closed
mboret opened this issue Apr 11, 2024 · 2 comments

mboret commented Apr 11, 2024

Hi,

I'm trying to list all items (SharePoint documents) under a specific site, but I can't get more than 200 items (my site has 395 docs). Pagination doesn't seem to work.

Even adding top(1000) doesn't work.

How can I get a list of all objects (drive items) for my Site?

import os
from typing import Any

import msal

from office365.graph_client import GraphClient  # type: ignore
from office365.onedrive.driveitems.driveItem import DriveItem  # type: ignore
from office365.onedrive.sites.site import Site  # type: ignore


class SharepointConnector():
    def __init__(
        self,
    ) -> None:
        self.graph_client: GraphClient | None = None

    def load_credentials(self, credentials: dict[str, Any]) -> None:
        aad_client_id = credentials["aad_client_id"]
        aad_client_secret = credentials["aad_client_secret"]
        aad_directory_id = credentials["aad_directory_id"]

        def _acquire_token_func() -> dict[str, Any]:
            """
            Acquire token via MSAL
            """
            authority_url = f"https://login.microsoftonline.com/{aad_directory_id}"
            app = msal.ConfidentialClientApplication(
                authority=authority_url,
                client_id=aad_client_id,
                client_credential=aad_client_secret,
            )
            token = app.acquire_token_for_client(
                scopes=["https://graph.microsoft.com/.default"]
            )
            return token

        self.graph_client = GraphClient(_acquire_token_func)


    def get_all_driveitem_objects(
        self,
    ) -> list[DriveItem]:
        site_object = self.graph_client.sites.get_by_url("https://XXXX.sharepoint.com/sites/library")

        driveitem_list = []
        site_list_objects = site_object.lists.get().execute_query()
        
        for site_list_object in site_list_objects:
            try:
                query = site_list_object.drive.root.get_files(True)
                driveitems = query.execute_query()
                driveitem_list.extend(driveitems)
            except Exception:
                # Some site lists have no .drive.root, so this call fails for them.
                # That is fine: those lists contain no documents.
                pass

        return driveitem_list

if __name__ == '__main__':
    sc = SharepointConnector()
    sc.load_credentials(
        {
            "aad_client_id": os.environ["AAD_CLIENT_ID"],
            "aad_client_secret": os.environ["AAD_CLIENT_SECRET"],
            "aad_directory_id": os.environ["AAD_CLIENT_DIRECTORY_ID"],
        }
    )
    sc.get_all_driveitem_objects()

vgrem (Owner) commented May 6, 2024

Greetings,
thank you for catching this issue and providing the detailed info!

Even adding top(1000) doesn't work.

Indeed, it was not honored in the DriveItem.get_files(recursive, page_size) method. In the new version (2.5.9), the method is expected to return all items, regardless of whether the collection exceeds the default page size.

Example:

site = client.sites.get_by_url(site_url)
items = site.lists["Documents"].drive.root.get_files(True, 1000).execute_query()
print("{0} files found".format(len(items)))
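
For readers wondering why the count stopped at exactly 200: Microsoft Graph returns large collections one page at a time and puts an `@odata.nextLink` URL in the response whenever more pages remain. A ceiling of 200 is consistent with reading only the first page. The sketch below illustrates the general idea of following that link until it is absent. It is not the library's actual implementation; `iter_all_items`, `make_fake_fetch`, and the `fake://` URLs are hypothetical names used only for illustration, and the fake fetcher stands in for a real HTTP call.

```python
from typing import Any, Callable, Iterator, Optional

# Hypothetical type: a function that takes a URL and returns one decoded JSON page.
FetchPage = Callable[[str], dict[str, Any]]


def iter_all_items(first_url: str, fetch_page: FetchPage) -> Iterator[dict[str, Any]]:
    """Yield items from every page by following '@odata.nextLink'."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        # Each page carries its items under the 'value' key.
        yield from page.get("value", [])
        # The link is absent on the last page, which ends the loop.
        url = page.get("@odata.nextLink")


def make_fake_fetch(total: int, page_size: int) -> FetchPage:
    """Build an in-memory stand-in for a paged endpoint (illustration only)."""

    def fetch(url: str) -> dict[str, Any]:
        # The offset is encoded as the last path segment of the fake URL.
        offset = int(url.rsplit("/", 1)[-1])
        end = min(offset + page_size, total)
        page: dict[str, Any] = {"value": [{"id": i} for i in range(offset, end)]}
        if end < total:
            page["@odata.nextLink"] = f"fake://items/{end}"
        return page

    return fetch


# Numbers taken from the report above: 395 documents, 200 returned.
fetch = make_fake_fetch(total=395, page_size=200)
first_page = fetch("fake://items/0")["value"]
all_items = list(iter_all_items("fake://items/0", fetch))
print(len(first_page), len(all_items))  # prints: 200 395
```

Reading a single page reproduces the symptom from the report, while following the next link recovers all 395 items.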

vgrem closed this as completed May 6, 2024

mboret (Author) commented May 22, 2024

Hi @vgrem,

Sorry for the delay. I confirm it works now: with version 2.5.9 I see all my docs.

Thanks!
